Formula columns
Formula columns enable users to create columns whose values are automatically derived from other columns, such as a column of molecular weights calculated from a column of SMILES-encoded chemical structures. This page outlines how users can write formulas.
Cell values
Formulas are evaluated for a single row of a database. Within formulas, the values of the cells of this row can be accessed using the syntax row['<column_key>']
.
Constants
Constants are predefined values, such as Pi, that can be used in formulas. Constants make it easier to perform calculations and create complex formulas by providing users standard mathematical and scientific values. To use a constant in a formula, simply type its symbol.
Constants are case-sensitive.
Symbol(s) | Description | Value |
---|---|---|
e , E | Euler’s number, the base of the natural logarithm. | 2.718281828 |
i | Imaginary unit, defined as i i = -1. A complex number is described as a + b i, where a is the real part, and b is the imaginary part. | sqrt(-1) |
Infinity | Infinity, a number which is larger than the maximum number that can be handled by a floating point number. | Infinity |
LN2 | Returns the natural logarithm of 2. | 0.6931471806 |
LN10 | Returns the natural logarithm of 10. | 2.302585093 |
LOG2E | Returns the base-2 logarithm of E. | 1.442695041 |
LOG10E | Returns the base-10 logarithm of E. | 0.4342944819 |
NaN | Not a number. | NaN |
null | Value null. | null |
phi | Phi is the golden ratio. Two quantities are in the golden ratio if their ratio is the same as the ratio of their sum to the larger of the two quantities. Phi is defined as (1 + sqrt(5)) / 2 | 1.618033989 |
pi , PI | The number pi is a mathematical constant that is the ratio of a circle's circumference to its diameter. | 3.141592654 |
SQRT1_2 | Returns the square root of 1/2. | 0.7071067812 |
SQRT2 | Returns the square root of 2. | 1.414213562 |
tau | Tau is the ratio constant of a circle's circumference to radius, equal to 2 * pi. | 6.283185307 |
Operators
Operators are an essential component of formulas. They allow you to perform calculations, make comparisons, and combine logical statements. Our formulas support three types of operators:
- Arithmetic operators: Perform mathematical calculations on numeric values.
- Comparison operators: Compare two values and return a Boolean result (true or false).
- Logical operators: Combine multiple conditions and return a Boolean result (true or false).
Arithmetic operators
Symbol | Name | Syntax | Example formula | Example result |
---|---|---|---|---|
+ | Add | x + y | 4 + 5 | 9 |
+ | Unary plus | +y | 4 | 4 |
- | Subtract | x - y | 7 - 3 | 4 |
- | Unary minus | -y | -4 | -4 |
* | Multiply | x * y | 2 * 3 | 6 |
/ | Divide | x / y | 6 / 2 | 3 |
% | Percentage | x% | 8% | 0.08 |
% | Addition with percentage | x + y% | 100 + 3% | 103 |
% | Subtraction with percentage | x - y% | 100 - 3% | 97 |
% | Modulus | x % y | 8 % 3 | 2 |
^ | Power | x ^ y | 2 ^ 3 | 8 |
! | Factorial | y! | 5! | 120 |
Logical operators
Symbol | Name | Syntax | Example formula | Example result |
---|---|---|---|---|
and | And | x and y | true and false | false |
not | Not | not y | not true | false |
or | Or | x or y | true or false | true |
xor | Xor | x xor y | true xor true | false |
Comparison operators
Symbol | Name | Syntax | Example formula | Example result |
---|---|---|---|---|
== | Equal | x == y | 2 == 4 - 2 | true |
!= | Unequal | x != y | 2 != 3 | true |
< | Less than | x < y | 2 < 3 | true |
> | Greater than | x > y | 2 > 3 | false |
<= | Less than or equal to | x <= y | 4 <= 3 | false |
>= | Greater than or equal to | x >= y | 2 + 4 >= 6 | true |
? : | Conditional expression | x ? y : z | 15 > 100 ? 1 : -1 | -1 |
Functions
Formulas support a variety of functions for performing calculations and manipulating data. Furthermore, multiple functions can be combined or used with operators to create complex formulas such as add(abs(row['measurement 1']), ceil(cbrt(row['measurement 2'])))
.
To use a function in a formula:
- In the formula, type the function name followed its argument(s) in parentheses such as
functionName(arguments)
. - Inside the parentheses, specify the column(s) you want to use as arguments using the syntax
row['<column_key>']
. - For functions that accept multiple arguments, separate the arguments with commas.
Foundational functions
Function | Description | Example formula | Example result |
---|---|---|---|
abs(x) | Calculates the absolute value of a number in the specified column | abs(row['relative_fluorescence']) | abs(-5) = 5 |
add(x, y) | Adds the values from two specified columns | add(row['initial_mass'], row['final_mass']) | add(10, 5) = 15 |
cbrt(x) | Calculates the cubic root of a value in the specified column | cbrt(row['cell_volume']) | cbrt(27) = 3 |
ceil(x) | Rounds a value towards plus infinity | ceil(row['growth_rate']) | ceil(4.3) = 5 |
cube(x) | Computes the cube of a value in the specified column | cube(row['cell_diameter']) | cube(3) = 27 |
divide(x, y) | Divides the value from the first column by the value from the second column | divide(row['total_mass'], row['sample_count']) | divide(20, 4) = 5 |
exp(x) | Calculates the exponential of a value in the specified column | exp(row['log_intensity']) | exp(1) = 2.718 |
expm1(x) | Calculates the value of subtracting 1 from the exponential value | expm1(row['log_intensity']) | expm1(1) = 1.718 |
fix(x) | Rounds a value towards zero | fix(row['growth_rate']) | fix(4.7) = 4 |
floor(x) | Rounds a value towards minus infinity | floor(row['growth_rate']) | floor(4.7) = 4 |
gcd(x, y) | Calculates the greatest common divisor for values in two columns | gcd(row['value1'], row['value2']) | gcd(8, 12) = 4 |
hypot(x, y, ...) | Calculates the hypotenuse of values in specified columns | hypot(row['value1'], row['value2']) | hypot(3, 4) = 5 |
lcm(x, y) | Calculates the least common multiple for values in two columns | lcm(row['value1'], row['value2']) | lcm(4, 6) = 12 |
log(x [, base]) | Calculates the logarithm of a value in the specified column | log(row['intensity'], 2) | log(8, 2) = 3 |
log10(x) | Calculates the base-10 logarithm of a value | log10(row['intensity']) | log10(100) = 2 |
log1p(x) | Calculates the logarithm of a value + 1 i.e. log(x+1) | log1p(row['intensity']) | log1p(1) = 0.693 |
log2(x) | Calculates the base-2 logarithm of a value | log2(row['intensity']) | log2(8) = 3 |
mod(x, y) | Calculates the modulus (remainder) of the division of two values | mod(row['value1'], row['value2']) | mod(10, 3) = 1 |
multiply(x, y) | Multiplies values from two specified columns | multiply(row['length'], row['width']) | multiply(4, 5) = 20 |
nthRoot(x, n) | Calculates the nth root of a value | nthRoot(row['value'], 3) | nthRoot(27, 3) = 3 |
pow(x, y) | Calculates the power of one value to another | pow(row['base'], row['exponent']) | pow(2, 3) = 8 |
round(x [, n]) | Rounds a value to the nearest integer or specified decimal places | round(row['intensity'], 2) | round(4.567, 2) = 4.57 |
sign(x) | Computes the sign of a value | sign(row['intensity']) | sign(-5) = -1 |
sqrt(x) | Calculates the square root of a value | sqrt(row['intensity']) | sqrt(16) = 4 |
square(x) | Computes the square of a value | square(row['cell_diameter']) | square(3) = 9 |
subtract(x, y) | Subtracts the second value from the first value | subtract(row['final_mass'], row['initial_mass']) | subtract(10, 5) = 5 |
unaryMinus(x) | Inverses the sign of a value | unaryMinus(row['growth_rate']) | unaryMinus(5) = -5 |
unaryPlus(x) | Unary plus operation (returns the same value) | unaryPlus(row['growth_rate']) | unaryPlus(5) = 5 |
format(x [, precision]) | Format a value of any type into a string | format(row['value'], 2) | format(5.6789, 2) = "5.68" |
compare(x, y) | Compare two values in specified columns. Returns 1 if value_1 > value_2. Returns 0 if value_1=value_2. Returns -1 if value_1<value_2 | compare(row['length1'], row['length2']) | compare(5, 3) = 1 |
equal(x, y) | Test whether two values are equal | equal(row['mass1'], row['mass2']) | equal(5, 5) = true |
equalText(x, y) | Check equality of two strings | equalText(row['name1'], row['name2']) | equalText("apple", "apple") = true |
larger(x, y) | Test whether value in the first column is larger than in the second column | larger(row['mass1'], row['mass2']) | larger(10, 5) = true |
largerEq(x, y) | Test whether value in the first column is larger or equal to the value in the second column | largerEq(row['mass1'], row['mass2']) | largerEq(10, 10) = true |
smaller(x, y) | Test whether value in the first column is smaller than the value in the second column | smaller(row['mass1'], row['mass2']) | smaller(5, 10) = true |
smallerEq(x, y) | Test whether value in the first column is smaller or equal to the value in the second column | smallerEq(row['mass1'], row['mass2']) | smallerEq(5, 5) = true |
unequal(x, y) | Test whether two values are unequal | unequal(row['mass1'], row['mass2']) | unequal(5, 10) = true |
hasNumericValue(x) | Test whether a value is a numeric value | hasNumericValue(row['value']) | hasNumericValue(5) = true |
isInteger(x) | Test whether a value is an integer number | isInteger(row['value']) | isInteger(5) = true |
isNaN(x) | Test whether a value is NaN (not a number) | isNaN(row['value']) | isNan(5) = false |
isNegative(x) | Test whether a value is negative (smaller than zero) | isNegative(row['value']) | isNegative(-5) = true |
isNumeric(x) | Test whether a value is a numeric value | isNumeric(row['value']) | isNumeric(5) = true |
isPositive(x) | Test whether a value is positive (larger than zero) | isPositive(row['value']) | isPositive(5) = true |
isPrime(x) | Test whether a value is prime (has no divisors other than itself and one) | isPrime(row['value']) | isPrime(7) = true |
isZero(x) | Test whether a value is zero | isZero(row['value']) | isZero(0) = true |
Bioinformatics functions
Our formulas also support specialized functions for life science, such as for calculating properties of chemical structures and protein sequences. These functions enable you to perform calculations on your IP without sending your IP to external servers.
Function | Description | Example formula | Example result |
---|---|---|---|
protein_seq_molecular_weight(x) | Calculates the molecular weight of a protein sequence by summing the average atomic masses of the amino acids and accounting for water loss during peptide bond formation. | protein_seq_molecular_weight(row['sequence']) | Molecular weight in Daltons (Da) |
protein_seq_hydrophobicity(x) | Evaluates the overall hydrophobicity of a protein sequence using established hydrophobicity scales for each amino acid residue. | protein_seq_hydrophobicity(row['sequence']) | Hydrophobicity score |
protein_seq_isoelectric_point(x) | Determines the isoelectric point (pI) of a protein sequence, the pH at which the protein has no net charge. | protein_seq_isoelectric_point(row['sequence']) | pH value |
protein_seq_molar_extinction_coefficient_reduced(x) | Computes the molar extinction coefficient of a protein sequence under reducing conditions, considering only the contributions of free amino acids. | protein_seq_molar_extinction_coefficient_reduced(row['sequence']) | Molar extinction coefficient (M⁻¹cm⁻¹) |
protein_seq_molar_extinction_coefficient_cystines(x) | Calculates the molar extinction coefficient of a protein sequence in non-reducing conditions, accounting for disulfide bonds (cystines). | protein_seq_molar_extinction_coefficient_cystines(row['sequence']) | Molar extinction coefficient (M⁻¹cm⁻¹) |
Cheminformatics functions
Function | Description | Example |
---|---|---|
chem_amw | Atomic molecular weight | chem_amw(row['smiles']) |
chem_chi0n | chi0n | chem_chi0n(row['smiles']) |
chem_chi0v | chi0v | chem_chi0v(row['smiles']) |
chem_chi1n | chi1n | chem_chi1n(row['smiles']) |
chem_chi1v | chi1v | chem_chi1v(row['smiles']) |
chem_chi2n | chi2n | chem_chi2n(row['smiles']) |
chem_chi2v | chi2v | chem_chi2v(row['smiles']) |
chem_chi3n | chi3n | chem_chi3n(row['smiles']) |
chem_chi3v | chi3v | chem_chi3v(row['smiles']) |
chem_chi4n | chi4n | chem_chi4n(row['smiles']) |
chem_chi4v | chi4v | chem_chi4v(row['smiles']) |
chem_crippenClogP | Wildman-Crippen LogP value | chem_crippenClogP(row['smiles']) |
chem_crippenMr | Wildman-Crippen MR value | chem_crippenMr(row['smiles']) |
chem_exactMw | Exact molecular weight | chem_exactMw(row['smiles']) |
chem_fractionCSP3 | Fraction of C atoms that are SP3 hybridized | chem_fractionCSP3(row['smiles']) |
chem_hallKierAlpha | Hall-Kier alpha value | chem_hallKierAlpha(row['smiles']) |
chem_kappa1 | Hall-Kier kappa1 value | chem_kappa1(row['smiles']) |
chem_kappa2 | Hall-Kier kappa2 value | chem_kappa2(row['smiles']) |
chem_kappa3 | Hall-Kier kappa3 value | chem_kappa3(row['smiles']) |
chem_labuteAsa | Labute's Approximate Surface Area (ASA) | chem_labuteAsa(row['smiles']) |
chem_lipinskiHba | Number of nitrogens and oxygens | chem_lipinskiHba(row['smiles']) |
chem_lipinskiHbd | Number of NHs and OHs | chem_lipinskiHbd(row['smiles']) |
chem_numAliphaticHeterocycles | Number of aliphatic (containing at least one non-aromatic bond) heterocycles | chem_numAliphaticHeterocycles(row['smiles']) |
chem_numAliphaticRings | Number of aliphatic (containing at least one non-aromatic bond) rings | chem_numAliphaticRings(row['smiles']) |
chem_numAmideBonds | Number of amide bonds | chem_numAmideBonds(row['smiles']) |
chem_numAromaticHeterocycles | Number of aromatic heterocycles | chem_numAromaticHeterocycles(row['smiles']) |
chem_numAromaticRings | Number of aromatic rings | chem_numAromaticRings(row['smiles']) |
chem_numAtoms | Number of atoms | chem_numAtoms(row['smiles']) |
chem_numAtomStereoCenters | Number of stereo centers | chem_numAtomStereoCenters(row['smiles']) |
chem_numBridgeheadAtoms | Number of bridgehead atoms (atoms shared between rings that share at least two bonds) | chem_numBridgeheadAtoms(row['smiles']) |
chem_numHba | Number of H-bond acceptors | chem_numHba(row['smiles']) |
chem_numHbd | Number of H-bond donors | chem_numHbd(row['smiles']) |
chem_numHeavyAtoms | Number of heavy atoms | chem_numHeavyAtoms(row['smiles']) |
chem_numHeterocycles | Number of heterocycles | chem_numHeterocycles(row['smiles']) |
chem_numHeteroatoms | Number of heteroatoms | chem_numHeteroatoms(row['smiles']) |
chem_numRings | Number of rings | chem_numRings(row['smiles']) |
chem_numRotatableBonds | Number of rotatable bonds | chem_numRotatableBonds(row['smiles']) |
chem_numSaturatedHeterocycles | Number of saturated heterocycles | chem_numSaturatedHeterocycles(row['smiles']) |
chem_numSaturatedRings | Number of saturated rings | chem_numSaturatedRings(row['smiles']) |
chem_numSpiroAtoms | Number of spiro atoms (atoms shared between rings that share exactly one atom) | chem_numSpiroAtoms(row['smiles']) |
chem_numUnspecifiedAtomStereoCenters | Number of unspecified atom stereo centers | chem_numUnspecifiedAtomStereoCenters(row['smiles']) |
chem_phi | Phi | chem_phi(row['smiles']) |
chem_tpsa | TPSA | chem_tpsa(row['smiles']) |
Examples
Derived IDs
Formulas can be used to automatically generate sequential IDs derived from references to other databases. The example below illustrates how to create derived IDs for preparations of plasmids with the format {plasmid-id}-{prep-number}
.
concat(row['plasmid'].hid, '-', string(generate_sequential_id(row['plasmid'].hid)))