Samples file
The samples file (samples.csv) contains the phenotypic information of each sample. The first column contains the sample name, which must be unique, and has to match the name given in the read counts file.
The samples file is a tabular (csv) file with the samples in the rows and the phenotypic data (metadata) in the columns. Note that the platform will not accept purely numerical values as phenotypes.
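The requirement that sample names match the read counts file can be checked before uploading. The following is a minimal sketch, not part of the platform: the helper name `compare_sample_names` and the example name lists are hypothetical, and the way you extract names from your own read counts file depends on its layout.

```python
def compare_sample_names(sample_names, count_names):
    """Report names that appear on only one side.

    Both arguments are plain lists of strings: the first column of the
    samples file and the sample names taken from the read counts file.
    """
    samples, counts = set(sample_names), set(count_names)
    return {
        "only_in_samples": sorted(samples - counts),
        "only_in_counts": sorted(counts - samples),
    }


# Hypothetical name lists, for illustration only.
sample_names = ["sample1", "sample2", "sample3"]
count_names = ["sample1", "sample2", "sample4"]

result = compare_sample_names(sample_names, count_names)
print(result)
```

Both lists in the result are empty when the two files agree on the sample names.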
In a human study, for example (the same layout applies to any study), the rows of the samples.csv table below would be anonymized patients, identified uniquely by the first column (sample1, sample2, …), and the other columns would hold sample metadata or phenotypes (hair color, country, weight, age, etc.).
| | hair_color | country | age |
|---|---|---|---|
| sample1 | blond | Japan | old |
| sample2 | dark | Switzerland | young |
| sample3 | blond | USA | young |
| sample4 | dark | Switzerland | old |
| sample5 | dark | USA | old |
As mentioned above, the age was converted from numeric (12, 52, 87) to young and old, since the platform currently does not support continuous values.
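One way to do this kind of conversion before upload is to bin the numeric values into labelled groups. The sketch below is only an illustration: the function name `age_group` and the cutoff of 40 are arbitrary assumptions, not values used by the platform or by the example table.

```python
def age_group(age, cutoff=40):
    """Map a numeric age to a discrete label.

    The cutoff of 40 is an arbitrary assumption chosen for this example.
    """
    return "young" if age < cutoff else "old"


ages = [12, 52, 87]
labels = [age_group(a) for a in ages]
print(labels)
```

Choose the cutoff, and the number of groups, to suit the study; two groups are used here only because the example table has two.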
Note
All phenotypes must contain at least one letter of the alphabet. This rule prevents continuous values (such as weight) from being uploaded, since the platform expects discrete groups. An excessive number of phenotypic groups may also result in errors.
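These rules can be checked locally before uploading. The following is a minimal sketch under stated assumptions, not the platform's own validation: the function name `check_samples` is hypothetical, the file is assumed to be comma-separated with a header row, and an empty cell is reported as having no letter.

```python
import csv
import io


def check_samples(csv_text):
    """Return a list of problems found in the text of a samples file."""
    # Skip completely empty lines so that they do not count as rows.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    problems = []

    # Rule 1: sample names (first column) must be unique.
    seen = set()
    for row in body:
        name = row[0]
        if name in seen:
            problems.append(f"duplicate sample name: {name}")
        seen.add(name)

    # Rule 2: every phenotype value must contain at least one letter.
    for row in body:
        for column, value in zip(header[1:], row[1:]):
            if not any(ch.isalpha() for ch in value):
                problems.append(f"{row[0]}/{column}: '{value}' has no letter")

    return problems


good = """,hair_color,country,age
sample1,blond,Japan,old
sample2,dark,Switzerland,young
"""

# Illustrative bad input: a repeated sample name and a purely numeric age.
bad = """,hair_color,country,age
sample1,blond,Japan,87
sample1,dark,Switzerland,young
"""

print(check_samples(good))
print(check_samples(bad))
```

An empty list means neither rule was violated; the check does not try to detect an excessive number of groups, since no limit is stated here.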
See also
If you are familiar with R, you can think of the samples file as a data.frame object. We provide an example samples file that can be accessed by installing playbase with devtools::install_github("bigomics/playbase") and then running playbase::SAMPLES.