Counts file

The counts file contains the measurements for the features (genes, proteins, etc.) of each sample listed in the samples file. Like the samples file, counts.csv is a tabular (.csv) file in which each row corresponds to a feature (gene, protein, etc.) and each column corresponds to a sample.

Each row starts with a gene ID, which can be in most common formats (such as HGNC or Ensembl), but not in the Entrez number format. If you are using Entrez numbers, please convert them to Ensembl IDs using a tool such as Syngo.
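Before uploading, it can help to check whether any feature IDs look like Entrez numbers. The following is a minimal sketch, not part of the platform: it assumes that Entrez IDs are plain integers while the accepted formats contain letters, and the function name `find_entrez_like_ids` is hypothetical.

```python
import re

# Assumption: Entrez gene IDs consist only of digits (e.g. "7157"),
# while accepted formats contain letters (e.g. "ENSG00000141510", "TP53").
ENTREZ_LIKE = re.compile(r"^\d+$")


def find_entrez_like_ids(feature_ids):
    """Return the IDs that consist only of digits and so look like Entrez numbers."""
    return [fid for fid in feature_ids if ENTREZ_LIKE.match(fid.strip())]


ids = ["ENSG00000141510", "TP53", "7157", "672"]
print(find_entrez_like_ids(ids))  # ['7157', '672']
```

Any ID returned by this check would need to be converted before the file is used.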

The values must always be numerical, with the exception of “NA” for missing data. Any other non-numerical value will result in an error.
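The rule on values can be checked locally before upload. This is a minimal sketch under stated assumptions, not the platform's own validation: it assumes a comma-delimited file whose first column holds the feature IDs, and the function name `find_invalid_values` is hypothetical.

```python
import csv
import io


def find_invalid_values(csv_text):
    """Return (feature_id, sample, value) for every cell that is neither numeric nor "NA"."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = header[1:]  # the first header cell belongs to the feature-ID column
    problems = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        feature_id, values = row[0], row[1:]
        for sample, value in zip(samples, values):
            if value == "NA":
                continue
            try:
                # Note: float() also accepts "nan" and "inf"; the platform
                # may be stricter than this check.
                float(value)
            except ValueError:
                problems.append((feature_id, sample, value))
    return problems


example = """,sample1,sample2,sample3
gene1,543.6,1556.1,413.0
gene2,6.5,n/a,2.3
gene3,10.4,763.5,NA
"""
print(find_invalid_values(example))  # [('gene2', 'sample2', 'n/a')]
```

An empty result means every cell is either a number or “NA”.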

Below is a simple example of what a counts.csv file should look like.

	sample1	sample2	sample3	sample4	sample5
gene1	543.6	1556.1	413.0	887.9	123.4
gene2	6.5	14.7	2.3	42.4	56.7
gene3	10.4	763.5	NA	0	89.0
gene4	3217.4	0	4983.2	7493.8	210.2
gene5	98770.5	113498.0	498351.6	88134.1	345.6
gene6	0	NA	14.9	0	789.0
gene7	47648.8	0	32682.0	93873.2	123.4

Note

The formats accepted as features (genes, proteins) are ENSEMBL, ENSEMBLTRAN, UNIGENE, REFSEQ, ACCNUM, UNIPROT and gene SYMBOL. Also note that the platform will not accept transcript IDs; you will need to convert them to gene IDs. The conversion can produce multiple entries for the same gene, which the platform will merge.
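The conversion from transcript IDs to gene IDs can be sketched as follows. This is only an illustration: the lookup table `TRANSCRIPT_TO_GENE`, the IDs in it, and the function `to_gene_rows` are all hypothetical, and a real mapping would come from an annotation source of your choice. The sketch does not merge duplicate genes, since that step is left to the platform.

```python
from collections import Counter

# Hypothetical transcript-to-gene lookup; the IDs below are made up.
TRANSCRIPT_TO_GENE = {
    "TX0001": "GENE_A",
    "TX0002": "GENE_A",
    "TX0003": "GENE_B",
}


def to_gene_rows(rows, mapping):
    """Replace the transcript ID of each (id, values) row with its gene ID.

    IDs missing from the mapping are returned separately so they can be
    inspected rather than silently dropped.
    """
    converted, unmapped = [], []
    for transcript_id, values in rows:
        gene_id = mapping.get(transcript_id)
        if gene_id is None:
            unmapped.append(transcript_id)
        else:
            converted.append((gene_id, values))
    return converted, unmapped


rows = [
    ("TX0001", [5.0, 7.5]),
    ("TX0002", [1.0, 0.0]),
    ("TX0003", [3.2, 4.1]),
    ("TX0009", [9.9, 9.9]),
]
converted, unmapped = to_gene_rows(rows, TRANSCRIPT_TO_GENE)

# Genes that now appear on more than one row.
duplicates = {g: n for g, n in Counter(g for g, _ in converted).items() if n > 1}
print(duplicates)  # {'GENE_A': 2}
print(unmapped)    # ['TX0009']
```

Here two transcripts map to the same gene, so the converted file contains two rows for `GENE_A`.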