Skip to main content

Formula columns

Formula columns enable users to create columns whose values are automatically derived from other columns, such as a column of molecular weights calculated from a column of SMILES-encoded chemical structures. This page outlines how users can write formulas.

Cell values

Formulas are evaluated for a single row of a database. Within formulas, the values of the cells of this row can be accessed using the syntax row['<column_key>'].

Constants

Constants are predefined values, such as Pi, that can be used in formulas. Constants make it easier to perform calculations and create complex formulas by providing users standard mathematical and scientific values. To use a constant in a formula, simply type its symbol.

caution

Constants are case-sensitive.

Symbol(s)DescriptionValue
e, EEuler’s number, the base of the natural logarithm.2.718281828
iImaginary unit, defined as i i = -1. A complex number is described as a + b i, where a is the real part, and b is the imaginary part.sqrt(-1)
InfinityInfinity, a number which is larger than the maximum number that can be handled by a floating point number.Infinity
LN2Returns the natural logarithm of 2.0.6931471806
LN10Returns the natural logarithm of 10.2.302585093
LOG2EReturns the base-2 logarithm of E.1.442695041
LOG10EReturns the base-10 logarithm of E.0.4342944819
NaNNot a number.NaN
nullValue null.null
phiPhi is the golden ratio. Two quantities are in the golden ratio if their ratio is the same as the ratio of their sum to the larger of the two quantities. Phi is defined as (1 + sqrt(5)) / 21.618033989
pi, PIThe number pi is a mathematical constant that is the ratio of a circle's circumference to its diameter.3.141592654
SQRT1_2Returns the square root of 1/2.0.7071067812
SQRT2Returns the square root of 2.1.414213562
tauTau is the ratio constant of a circle's circumference to radius, equal to 2 * pi.6.283185307

Operators

Operators are an essential component of formulas. They allow you to perform calculations, make comparisons, and combine logical statements. Our formulas support three types of operators:

  • Arithmetic operators: Perform mathematical calculations on numeric values.
  • Comparison operators: Compare two values and return a Boolean result (true or false).
  • Logical operators: Combine multiple conditions and return a Boolean result (true or false).

Arithmetic operators

SymbolNameSyntaxExample formulaExample result
+Addx + y4 + 59
+Unary plus+y44
-Subtractx - y7 - 34
-Unary minus-y-4-4
*Multiplyx * y2 * 36
/Dividex / y6 / 23
%Percentagex%8%0.08
%Addition with percentagex + y%100 + 3%103
%Subtraction with percentagex - y%100 - 3%97
%Modulusx % y8 % 32
^Powerx ^ y2 ^ 38
!Factorialy!5!120

Logical operators

SymbolNameSyntaxExample formulaExample result
andAndx and ytrue and falsefalse
notNotnot ynot truefalse
orOrx or ytrue or falsetrue
xorXorx xor ytrue xor truefalse

Comparison operators

SymbolNameSyntaxExample formulaExample result
==Equalx == y2 == 4 - 2true
!=Unequalx != y2 != 3true
<Less thanx < y2 < 3true
>Greater thanx > y2 > 3false
<=Less than or equal tox <= y4 <= 3false
>=Greater than or equal tox >= y2 + 4 >= 6true
? :Conditional expressionx ? y : z15 > 100 ? 1 : -1-1

Functions

Formulas support a variety of functions for performing calculations and manipulating data. Furthermore, multiple functions can be combined or used with operators to create complex formulas such as add(abs(row['measurement 1']), ceil(cbrt(row['measurement 2']))).

To use a function in a formula:

  1. In the formula, type the function name followed its argument(s) in parentheses such as functionName(arguments).
  2. Inside the parentheses, specify the column(s) you want to use as arguments using the syntax row['<column_key>'].
  3. For functions that accept multiple arguments, separate the arguments with commas.

Foundational functions

FunctionDescriptionExample formulaExample result
abs(x)Calculates the absolute value of a number in the specified columnabs(row['relative_fluorescence'])abs(-5) = 5
add(x, y)Adds the values from two specified columnsadd(row['initial_mass'], row['final_mass'])add(10, 5) = 15
cbrt(x)Calculates the cubic root of a value in the specified columncbrt(row['cell_volume'])cbrt(27) = 3
ceil(x)Rounds a value towards plus infinityceil(row['growth_rate'])ceil(4.3) = 5
cube(x)Computes the cube of a value in the specified columncube(row['cell_diameter'])cube(3) = 27
divide(x, y)Divides the value from the first column by the value from the second columndivide(row['total_mass'], row['sample_count'])divide(20, 4) = 5
exp(x)Calculates the exponential of a value in the specified columnexp(row['log_intensity'])exp(1) = 2.718
expm1(x)Calculates the value of subtracting 1 from the exponential valueexpm1(row['log_intensity'])expm1(1) = 1.718
fix(x)Rounds a value towards zerofix(row['growth_rate'])fix(4.7) = 4
floor(x)Rounds a value towards minus infinityfloor(row['growth_rate'])floor(4.7) = 4
gcd(x, y)Calculates the greatest common divisor for values in two columnsgcd(row['value1'], row['value2'])gcd(8, 12) = 4
hypot(x, y, ...)Calculates the hypotenuse of values in specified columnshypot(row['value1'], row['value2'])hypot(3, 4) = 5
lcm(x, y)Calculates the least common multiple for values in two columnslcm(row['value1'], row['value2'])lcm(4, 6) = 12
log(x [, base])Calculates the logarithm of a value in the specified columnlog(row['intensity'], 2)log(8, 2) = 3
log10(x)Calculates the base-10 logarithm of a valuelog10(row['intensity'])log10(100) = 2
log1p(x)Calculates the logarithm of a value + 1 i.e. log(x+1)log1p(row['intensity'])log1p(1) = 0.693
log2(x)Calculates the base-2 logarithm of a valuelog2(row['intensity'])log2(8) = 3
mod(x, y)Calculates the modulus (remainder) of the division of two valuesmod(row['value1'], row['value2'])mod(10, 3) = 1
multiply(x, y)Multiplies values from two specified columnsmultiply(row['length'], row['width'])multiply(4, 5) = 20
nthRoot(x, n)Calculates the nth root of a valuenthRoot(row['value'], 3)nthRoot(27, 3) = 3
pow(x, y)Calculates the power of one value to anotherpow(row['base'], row['exponent'])pow(2, 3) = 8
round(x [, n])Rounds a value to the nearest integer or specified decimal placesround(row['intensity'], 2)round(4.567, 2) = 4.57
sign(x)Computes the sign of a valuesign(row['intensity'])sign(-5) = -1
sqrt(x)Calculates the square root of a valuesqrt(row['intensity'])sqrt(16) = 4
square(x)Computes the square of a valuesquare(row['cell_diameter'])square(3) = 9
subtract(x, y)Subtracts the second value from the first valuesubtract(row['final_mass'], row['initial_mass'])subtract(10, 5) = 5
unaryMinus(x)Inverses the sign of a valueunaryMinus(row['growth_rate'])unaryMinus(5) = -5
unaryPlus(x)Unary plus operation (returns the same value)unaryPlus(row['growth_rate'])unaryPlus(5) = 5
format(x [, precision])Format a value of any type into a stringformat(row['value'], 2)format(5.6789, 2) = "5.68"
compare(x, y)Compare two values in specified columns. Returns 1 if value_1 > value_2. Returns 0 if value_1=value_2. Returns -1 if value_1<value_2compare(row['length1'], row['length2'])compare(5, 3) = 1
equal(x, y)Test whether two values are equalequal(row['mass1'], row['mass2'])equal(5, 5) = true
equalText(x, y)Check equality of two stringsequalText(row['name1'], row['name2'])equalText("apple", "apple") = true
larger(x, y)Test whether value in the first column is larger than in the second columnlarger(row['mass1'], row['mass2'])larger(10, 5) = true
largerEq(x, y)Test whether value in the first column is larger or equal to the value in the second columnlargerEq(row['mass1'], row['mass2'])largerEq(10, 10) = true
smaller(x, y)Test whether value in the first column is smaller than the value in the second columnsmaller(row['mass1'], row['mass2'])smaller(5, 10) = true
smallerEq(x, y)Test whether value in the first column is smaller or equal to the value in the second columnsmallerEq(row['mass1'], row['mass2'])smallerEq(5, 5) = true
unequal(x, y)Test whether two values are unequalunequal(row['mass1'], row['mass2'])unequal(5, 10) = true
hasNumericValue(x)Test whether a value is a numeric valuehasNumericValue(row['value'])hasNumericValue(5) = true
isInteger(x)Test whether a value is an integer numberisInteger(row['value'])isInteger(5) = true
isNaN(x)Test whether a value is NaN (not a number)isNaN(row['value'])isNan(5) = false
isNegative(x)Test whether a value is negative (smaller than zero)isNegative(row['value'])isNegative(-5) = true
isNumeric(x)Test whether a value is a numeric valueisNumeric(row['value'])isNumeric(5) = true
isPositive(x)Test whether a value is positive (larger than zero)isPositive(row['value'])isPositive(5) = true
isPrime(x)Test whether a value is prime (has no divisors other than itself and one)isPrime(row['value'])isPrime(7) = true
isZero(x)Test whether a value is zeroisZero(row['value'])isZero(0) = true

Bioinformatics functions

Our formulas also support specialized functions for life science, such as for calculating properties of chemical structures and protein sequences. These functions enable you to perform calculations on your IP without sending your IP to external servers.

FunctionDescriptionExample formulaExample result
protein_seq_molecular_weight(x)Calculates the molecular weight of a protein sequence by summing the average atomic masses of the amino acids and accounting for water loss during peptide bond formation.protein_seq_molecular_weight(row['sequence'])Molecular weight in Daltons (Da)
protein_seq_hydrophobicity(x)Evaluates the overall hydrophobicity of a protein sequence using established hydrophobicity scales for each amino acid residue.protein_seq_hydrophobicity(row['sequence'])Hydrophobicity score
protein_seq_isoelectric_point(x)Determines the isoelectric point (pI) of a protein sequence, the pH at which the protein has no net charge.protein_seq_isoelectric_point(row['sequence'])pH value
protein_seq_molar_extinction_coefficient_reduced(x)Computes the molar extinction coefficient of a protein sequence under reducing conditions, considering only the contributions of free amino acids.protein_seq_molar_extinction_coefficient_reduced(row['sequence'])Molar extinction coefficient (M⁻¹cm⁻¹)
protein_seq_molar_extinction_coefficient_cystines(x)Calculates the molar extinction coefficient of a protein sequence in non-reducing conditions, accounting for disulfide bonds (cystines).protein_seq_molar_extinction_coefficient_cystines(row['sequence'])Molar extinction coefficient (M⁻¹cm⁻¹)

Cheminformatics functions

FunctionDescriptionExample
chem_amwAtomic molecular weightchem_amw(row['smiles'])
chem_chi0nchi0nchem_chi0n(row['smiles'])
chem_chi0vchi0vchem_chi0v(row['smiles'])
chem_chi1nchi1nchem_chi1n(row['smiles'])
chem_chi1vchi1vchem_chi1v(row['smiles'])
chem_chi2nchi2nchem_chi2n(row['smiles'])
chem_chi2vchi2vchem_chi2v(row['smiles'])
chem_chi3nchi3nchem_chi3n(row['smiles'])
chem_chi3vchi3vchem_chi3v(row['smiles'])
chem_chi4nchi4nchem_chi4n(row['smiles'])
chem_chi4vchi4vchem_chi4v(row['smiles'])
chem_crippenClogPWildman-Crippen LogP valuechem_crippenClogP(row['smiles'])
chem_crippenMrWildman-Crippen MR valuechem_crippenMr(row['smiles'])
chem_exactMwExact molecular weightchem_exactMw(row['smiles'])
chem_fractionCSP3Fraction of C atoms that are SP3 hybridizedchem_fractionCSP3(row['smiles'])
chem_hallKierAlphaHall-Kier alpha valuechem_hallKierAlpha(row['smiles'])
chem_kappa1Hall-Kier kappa1 valuechem_kappa1(row['smiles'])
chem_kappa2Hall-Kier kappa2 valuechem_kappa2(row['smiles'])
chem_kappa3Hall-Kier kappa3 valuechem_kappa3(row['smiles'])
chem_labuteAsaLabute's Approximate Surface Area (ASA)chem_labuteAsa(row['smiles'])
chem_lipinskiHbaNumber of nitrogens and oxygenschem_lipinskiHba(row['smiles'])
chem_lipinskiHbdNumber of NHs and OHschem_lipinskiHbd(row['smiles'])
chem_numAliphaticHeterocyclesNumber of aliphatic (containing at least one non-aromatic bond) heterocycleschem_numAliphaticHeterocycles(row['smiles'])
chem_numAliphaticRingsNumber of aliphatic (containing at least one non-aromatic bond) ringschem_numAliphaticRings(row['smiles'])
chem_numAmideBondsNumber of amide bondschem_numAmideBonds(row['smiles'])
chem_numAromaticHeterocyclesNumber of aromatic heterocycleschem_numAromaticHeterocycles(row['smiles'])
chem_numAromaticRingsNumber of aromatic ringschem_numAromaticRings(row['smiles'])
chem_numAtomsNumber of atomschem_numAtoms(row['smiles'])
chem_numAtomStereoCentersNumber of stereo centerschem_numAtomStereoCenters(row['smiles'])
chem_numBridgeheadAtomsNumber of bridgehead atoms (atoms shared between rings that share at least two bonds)chem_numBridgeheadAtoms(row['smiles'])
chem_numHbaNumber of H-bond acceptorschem_numHba(row['smiles'])
chem_numHbdNumber of H-bond donorschem_numHbd(row['smiles'])
chem_numHeavyAtomsNumber of heavy atomschem_numHeavyAtoms(row['smiles'])
chem_numHeterocyclesNumber of heterocycleschem_numHeterocycles(row['smiles'])
chem_numHeteroatomsNumber of heteroatomschem_numHeteroatoms(row['smiles'])
chem_numRingsNumber of ringschem_numRings(row['smiles'])
chem_numRotatableBondsNumber of rotatable bondschem_numRotatableBonds(row['smiles'])
chem_numSaturatedHeterocyclesNumber of saturated heterocycleschem_numSaturatedHeterocycles(row['smiles'])
chem_numSaturatedRingsNumber of saturated ringschem_numSaturatedRings(row['smiles'])
chem_numSpiroAtomsNumber of spiro atoms (atoms shared between rings that share exactly one atom)chem_numSpiroAtoms(row['smiles'])
chem_numUnspecifiedAtomStereoCentersNumber of unspecified atom stereo centerschem_numUnspecifiedAtomStereoCenters(row['smiles'])
chem_phiPhichem_phi(row['smiles'])
chem_tpsaTPSAchem_tpsa(row['smiles'])

Examples

Derived IDs

Formulas can be used to automatically generate sequential IDs derived from references to other databases. The example below illustrates how to create derived IDs for preparations of plasmids with the format {plasmid-id}-{prep-number}.

concat(row['plasmid'].hid, '-', string(generate_sequential_id(row['plasmid'].hid)))