ROBOT Report - stato

Types of errors

Level Number of errors
INFO 381
WARN 16

Error breakdown

Rule Number of errors
lowercase_definition 381
equivalent_class_axiom_no_genus 14
duplicate_exact_synonym 2

Row Level Rule Name Subject Property Value
0 WARN duplicate_exact_synonym STATO:0000636 IAO:0000118 NNS
1 WARN duplicate_exact_synonym STATO:0000637 IAO:0000118 NNS
2 WARN equivalent_class_axiom_no_genus STATO:0000027 OBI:0000417 STATO:0000121
3 WARN equivalent_class_axiom_no_genus STATO:0000033 OBI:0000312 OBI:0200117
4 WARN equivalent_class_axiom_no_genus STATO:0000085 OBI:0000295 STATO:0000175
5 WARN equivalent_class_axiom_no_genus STATO:0000119 OBI:0000299 STATO:0000144
6 WARN equivalent_class_axiom_no_genus STATO:0000131 OBI:0000417 STATO:0000183
7 WARN equivalent_class_axiom_no_genus STATO:0000133 BFO:0000062 OBI:0200201
8 WARN equivalent_class_axiom_no_genus STATO:0000137 OBI:0000417 STATO:0000226
9 WARN equivalent_class_axiom_no_genus STATO:0000191 OBI:0000417 STATO:0000224
10 WARN equivalent_class_axiom_no_genus STATO:0000202 OBI:0000417 STATO:0000253
11 WARN equivalent_class_axiom_no_genus STATO:0000247 OBI:0000417 STATO:0000173
12 WARN equivalent_class_axiom_no_genus STATO:0000279 OBI:0000417 STATO:0000255
13 WARN equivalent_class_axiom_no_genus STATO:0000337 OBI:0000299 STATO:0000485
14 WARN equivalent_class_axiom_no_genus STATO:0000443 OBI:0000417 STATO:0000439
15 WARN equivalent_class_axiom_no_genus STATO:0000471 STATO:0000403 STATO:0000039
16 INFO lowercase_definition STATO:0000001 IAO:0000115 property to indicate that a design declares a variable; the inverse property is 'is declared by'@en
17 INFO lowercase_definition STATO:0000002 IAO:0000115 an electronic file is an information content entity which conforms to a specification or format and which is meant to hold data and information in digital form, accessible to software agents@en
18 INFO lowercase_definition STATO:0000003 IAO:0000115 a balanced design is a an experimental design where all experimental group have the an equal number of subject observations@en
19 INFO lowercase_definition STATO:0000004 IAO:0000115 property to indicate the variables declared by a design; the inverse property is 'declares'@en
20 INFO lowercase_definition STATO:0000005 IAO:0000115 a single factor design is a study design which declares exactly 1 independent variable@en
21 INFO lowercase_definition STATO:0000006 IAO:0000115 x-axis is a cartesian coordinate axis which is orthogonal to the y-axis and the z-axis@en
22 INFO lowercase_definition STATO:0000007 IAO:0000115 an axis is a line graph used as reference line for the measurement of coordinates.@en
23 INFO lowercase_definition STATO:0000008 IAO:0000115 y-axis is a cartesian coordinate axis which is orthogonal to the x-axis and the z-axis@en
24 INFO lowercase_definition STATO:0000011 IAO:0000115 a cartesian axis is one of 3 the axis in a cartesian coordinate system defining a referential in 3 dimensions. each of the axis is orthogonal to the other 2@en
25 INFO lowercase_definition STATO:0000012 IAO:0000115 z-axis is a cartesian coordinate axis which is orthogonal to the x-axis and the y-axis@en
26 INFO lowercase_definition STATO:0000013 IAO:0000115 a 2 dimensional cartesian coordinate system is a cartesian coordinate system which defines 2 orthogonal one dimensional axes and which may be used to describe a 2 dimensional spatial region.
27 INFO lowercase_definition STATO:0000019 IAO:0000115 normal distribution hypothesis is a goodness of fit hypothesis stating that the distribution computed from the sample population fits a normal distribution.@en
28 INFO lowercase_definition STATO:0000021 IAO:0000115 a confidence interval which covers 90% of the sampling distribution, meaning that there is a 90% risk of false positive (type I error)@en
29 INFO lowercase_definition STATO:0000024 IAO:0000115 a three dimensional cartesian coordinate system is a cartesian coordinate system which defines 3 orthogonal one dimensional axes and which may be used to describe a 3 dimensional spatial region.
30 INFO lowercase_definition STATO:0000027 IAO:0000115 linkage between 2 categorical variable test is a statistical test which evaluates if there is an association between a predictor variable assuming discrete values and a response variable also assuming discrete values@en
31 INFO lowercase_definition STATO:0000028 IAO:0000115 measure of variation or statistical dispersion is a data item which describes how much a theoritical distribution or dataset is spread.@en
32 INFO lowercase_definition STATO:0000029 IAO:0000115 a measure of central tendency is a data item which attempts to describe a set of data by identifying the value of its centre.@en
33 INFO lowercase_definition STATO:0000031 IAO:0000115 binary classification (or binomial classification) is a data transformation which aims to cast members of a set into 2 disjoint groups depending on whether the element have a given property/feature or not.@en
34 INFO lowercase_definition STATO:0000032 IAO:0000115 an alternative term used for STATO statistical ontology and ISA team@en
35 INFO lowercase_definition STATO:0000034 IAO:0000115 a model parameter is a data item which is part of a model and which is meant to characterize an theoritecal or unknown population. a model parameter may be estimated by considering the properties of samples presumably taken from the theoritecal population@en
36 INFO lowercase_definition STATO:0000035 IAO:0000115 the range is a measure of variation which describes the difference between the lowest score and the highest score in a set of numbers (a data set)
37 INFO lowercase_definition STATO:0000038 IAO:0000115 a set of 2 subjects which result from a pairing process which assigns subject to a set based on a pairing rule/criteria@en
38 INFO lowercase_definition STATO:0000039 IAO:0000115 a statistic is a measurement datum to describe a dataset or a variable. It is generated by a calculation on set of observed data.@en
39 INFO lowercase_definition STATO:0000040 IAO:0000115 an MA plot is a scatter plot of the log intensity ratios M = log_2(T/R) versus the average log intensities A = log_2(T*T)/2, where T and R represent the signal intensities in the test and reference channels respectively.@en
40 INFO lowercase_definition STATO:0000041 IAO:0000115 a R command syntax or link to a R documentation in support of Statistical Ontology Classes or Data Transformations@en
41 INFO lowercase_definition STATO:0000043 IAO:0000115 a false positive rate whose value is 5 per cent@en
42 INFO lowercase_definition STATO:0000044 IAO:0000115 one-way anova is an analysis of variance where the different groups being compared are associated with the factor levels of only one independent variable. The null hypothesis is an absence of difference between the means calculated for each of the groups. The test assumes normality and equivariance of the data.@en
43 INFO lowercase_definition STATO:0000045 IAO:0000115 two-way anova is an analysis of variance where the different groups being compared are associated the factor levels of exatly 2 independent variables. The null hypothesis is an absence of difference between the means calculated for each of the groups. The test assumes normality and equivariance of the data.@en
44 INFO lowercase_definition STATO:0000046 IAO:0000115 a block design is a kind of study design which declares a blocking variable (also known as nuisance variable) in order to account for a known source of variation and reduce its impact on the acquisition of the signal@en
45 INFO lowercase_definition STATO:0000047 IAO:0000115 a count is a data item denoted by an integer and representing the number of instances or occurences of an entity@en
46 INFO lowercase_definition STATO:0000050 IAO:0000115 signal to noise ratio is a measurement datum comparing the amount of meaningful, useful or interesting data (the signal) to the amount of irrelevant or false data (the noise). Depending on the field and domain of application, different variables will be used to determinate a 'signal to noise ratio'. In statistics, the definition of signal to noise ratio is the ratio of the mean of a measurement to its standard deviation. It thus corresponds to the inverse of the coefficient of variation@en
47 INFO lowercase_definition STATO:0000053 IAO:0000115 a false positive rate is a data item which accounts for the proportion of incorrect rejection of a true null hypothesis.@en
48 INFO lowercase_definition STATO:0000054 IAO:0000115 homoskedasticity states that all variances under consideration are homogenous.@en
49 INFO lowercase_definition STATO:0000055 IAO:0000115 chromosome coordinate system is a genomic coordinate which uses chromosome of a particular assembly build process to define start and end positions. This coordinate system is unstable and will change with each new genome sequence assembly build.@en
50 INFO lowercase_definition STATO:0000056 IAO:0000115 a null hypothesis which states that no linkage exists between 2 categorical variables@en
51 INFO lowercase_definition STATO:0000058 IAO:0000115 goodness of fit hypothesis is a null hypothesis stating that the distribution computed from the sample population fits a theoretical distribution or that a dataset can be correctly explained by a model@en
52 INFO lowercase_definition STATO:0000059 IAO:0000115 the Student's t distribution is a continuous probability distribution which arises when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown.@en
53 INFO lowercase_definition STATO:0000060 IAO:0000115 hypergeometric distribution is a probability distribution that describes the probability of k successes in n draws from a finite population of size N containing K successes without replacement@en
54 INFO lowercase_definition STATO:0000062 IAO:0000115 is a null hypothesis stating that there are no difference observed across a series of measurements made one same subject.@en
55 INFO lowercase_definition STATO:0000063 IAO:0000115 genomic coordinate datum is a data item which denotes a genomic position expressed using a genomic coordinate system@en
56 INFO lowercase_definition STATO:0000064 IAO:0000115 sequence read count is a data item determining how many sequence reads have been generated by a DNA sequencing assay for a given stretch of DNA
57 INFO lowercase_definition STATO:0000067 IAO:0000115 a continuous probability distribution is a probability distribution which is defined by a probability density function@en
58 INFO lowercase_definition STATO:0000071 IAO:0000115 reaction rate is a measurement datum which represents the speed of a chemical reaction turning reactive species into product species of event (i.e the number of such conversions)s occuring over a time interval@en
59 INFO lowercase_definition STATO:0000072 IAO:0000115 substrate concentration is a scalar measurement datum which denotes the amount of molecular entity involved in an enzymatic reaction (or catalytic chemical reaction) and whose role in that reaction is as substrate.@en
60 INFO lowercase_definition STATO:0000075 IAO:0000115 a rarefaction curve is a graph used for estimating species richness in ecology studies@en
61 INFO lowercase_definition STATO:0000080 IAO:0000115 the Brown Forsythe test is a statistical test which evaluates if the variance of different groups are equal. It relies on computing the median rather than the mean, as used in the Levene's test for homoschedacity. This test maybe used to, for instance, ensure that the conditions of applications of ANOVA are met.@en
62 INFO lowercase_definition STATO:0000082 IAO:0000115 a fixed effect model is a statistical model which represents the observed quantities in terms of explanatory variables that are treated as if the quantities were non-random.@en
63 INFO lowercase_definition STATO:0000084 IAO:0000115 multinomial logistic regression model is a model which attempts to explain data distribution associated with *polychotomous* response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is probit function.@en
64 INFO lowercase_definition STATO:0000085 IAO:0000115 effect size estimate is a data item about the direction and strength of the consequences of a causative agent as explored by statistical methods. Those methods produce estimates of the effect size, e.g. confidence interval@en
65 INFO lowercase_definition STATO:0000086 IAO:0000115 an F-test is a statistical test which evaluates that the computed test statistics follows an F-distribution under the null hypothesis. The F-test is sensitive to departure from normality. F-test arise when decomposing the variability in a data set in terms of sum of squares.@en
66 INFO lowercase_definition STATO:0000087 IAO:0000115 a polychotomous variable is a categorical variable which is defined to have minimally 2 categories or possible values@en
67 INFO lowercase_definition STATO:0000088 IAO:0000115 statistical sample size is a count evaluating the number of individual experimental units@en
68 INFO lowercase_definition STATO:0000089 IAO:0000115 a case-control study design is a observation study design which assess the risk of particular outcome (a trait or a disease) associated with an event (either an exposure or endogenous factor). A case-control study design therefore declares an exposure variable which is dichotomous in nature (exposed/non-exposed) and an outcome variable, which is also dichotomous (case or control), thus giving the name to the design. During the execution of the design, a case control study defines a population and counts the events to determine their frequency.@en
69 INFO lowercase_definition STATO:0000090 IAO:0000115 a dichotomous variable is a categorical variable which is defined to have only 2 categories or possible values@en
70 INFO lowercase_definition STATO:0000095 IAO:0000115 paired t-test is a statistical test which is specifically designed to analysis differences between paired observations in the case of studies realizing repeated measures design with only 2 repeated measurements per subject (before and after treatment for example)@en
71 INFO lowercase_definition STATO:0000096 IAO:0000115 stratification is a planned process which executes a stratification rule using as input a population and assign it member to mutually exclusive subpopulation based on the values defined by the stratification rule@en
72 INFO lowercase_definition STATO:0000099 IAO:0000115 a random effect(s) model, also called a variance components model, is a kind of hierarchical linear model. It assumes that the dataset being analysed consists of a hierarchy of different populations whose differences relate to that hierarchy.@en
73 INFO lowercase_definition STATO:0000100 IAO:0000115 standardized mean difference is statistic computed by forming the difference between two means, divided by an estimate of the within-group standard deviation. It is used to provide an estimation of the effect size between two treatments when the predictor (independent variable) is categorical and the response(dependent) variable is continuous. A standardized mean difference is a statistic that is a difference between two means, divided by a statistical measure of dispersion. The term Standardized Mean Difference is a description of the concept without an explicit type of statistical measure of dispersion. If the statistical measure of dispersion is specified, then a type (child term) of Standardized Mean Difference is preferred.@en
74 INFO lowercase_definition STATO:0000101 IAO:0000115 the relationship between a fraction and the number above the line@en
75 INFO lowercase_definition STATO:0000102 IAO:0000115 relationship between a planned process and the plan specification that it carries out; it is defined as equivalent to the composed relationship (realizes o concretizes)@en
76 INFO lowercase_definition STATO:0000103 IAO:0000115 the multinomial distribution is a probability distribution which gives the probability of any particular combination of numbers of successes for various categories defined in the context of n independent trials each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability.@en
77 INFO lowercase_definition STATO:0000105 IAO:0000115 log signal intensity ratio is a data item which corresponding the logarithmitic base 2 of the ratio between 2 signal intensity, each corresponding to a condition.@en
78 INFO lowercase_definition STATO:0000106 IAO:0000115 probit regression model is a model which attempts to explain data distribution associated with *dichotomous* response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is the probit function aka the quantile function, i.e., the inverse cumulative distribution function (CDF), associated with the standard normal distribution.@en
79 INFO lowercase_definition STATO:0000107 IAO:0000115 a statistical model is an information content entity which is a formalization of relationships between variables in the form of mathematical equations. A statistical model describes how one or more random variables are related to one or more other variables. The model is statistical as the variables are not deterministically but stochastically related.@en
80 INFO lowercase_definition STATO:0000108 IAO:0000115 linear regression model is a model which attempts to explain data distribution associated with response/dependent variable in terms of values assumed by the independent variable uses a linear function or linear combination of the regression parameters and the predictor/independent variable(s). linear regression modeling makes a number of assumptions, which includes homoskedasticity (constance of variance)@en
81 INFO lowercase_definition STATO:0000109 IAO:0000115 multinomial logistic regression model is a model which attempts to explain data distribution associated with *polychotomous* response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is logistic function.@en
82 INFO lowercase_definition STATO:0000111 IAO:0000115 a sequence read is a DNA sequence data which is generated by a DNA sequencer@en
83 INFO lowercase_definition STATO:0000112 IAO:0000115 a Funnel plot is a scatter plot of treatment effect versus a measure of study size and aims to provide a visual aid to detecting bias or systematic heterogeneity. A symmetric inverted funnel shape arises from a ‘well-behaved’ data set, in which publication bias is unlikely. An asymmetric funnel indicates a relationship between treatment effect and study size. Known caveats: If high precision studies really are different from low precision studies with respect to effect size (e.g., due to different populations examined) a funnel plot may give a wrong impression of publication bias. The appearance of the funnel plot can change quite dramatically depending on the scale on the y-axis — whether it is the inverse square error or the trial size. Funnel plot was introduced by Light and Palmer in 1984.@en
84 INFO lowercase_definition STATO:0000113 IAO:0000115 variance is a data item about a random variable or probability distribution. it is equivalent to the square of the standard deviation. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value).The variance is the second moment of a distribution.@en
85 INFO lowercase_definition STATO:0000114 IAO:0000115 relationship between an element and a set it belongs to@en
86 INFO lowercase_definition STATO:0000115 IAO:0000115 relationship between a set and one of its elements@en
87 INFO lowercase_definition STATO:0000116 IAO:0000115 the process of using statistical analysis for interpreting and communicating "what the data say".@en
88 INFO lowercase_definition STATO:0000117 IAO:0000115 a discrete probability distribution is a probability distribution which is defined by a probability mass function where the random variable can only assume a finite number of values or infinitely countable values@en
89 INFO lowercase_definition STATO:0000118 IAO:0000115 ranking is a data transformation which turns a non-ordinal variable into a Ordinal variable by sorting the values of the input variable and replacing their value by their position in the sorting result@en
90 INFO lowercase_definition STATO:0000119 IAO:0000115 model parameter estimation is a data transformation that finds parameter values (the model parameter estimates) most compatible with the data as judged by the model.@en
91 INFO lowercase_definition STATO:0000120 IAO:0000115 beanplot is a plot in which (one or) multiple batches ("beans") are shown. Each bean consists of a density trace, which is mirrored to form a polygon shape. Next to that, a one-dimensional scatter plot shows all the individual measurements, like in a stripchart. The name beanplot stems from green beans. The density shape can be seen as the pod of a green bean, while the scatter plot shows the seeds inside the pod.@en
92 INFO lowercase_definition STATO:0000121 IAO:0000115 the objective of a data transformation to evaluate a null hypothesis of absence of linkage between variables.@en
93 INFO lowercase_definition STATO:0000122 IAO:0000115 a pedigree chart is a graph which plots parent child relations@en
94 INFO lowercase_definition STATO:0000123 IAO:0000115 r2 is a correlation coefficient which is computed over the frequency of 2 dichotomous variable and is used as a measure of Linkage Disequilibrium and as input data item to the creation of an LD plot@en
95 INFO lowercase_definition STATO:0000124 IAO:0000115 a stratification rule/criteria is a criteria used to determine population strata so that a stratification process implementing the rule can result in any member of the total population being assigned to one and only one stratum@en
96 INFO lowercase_definition STATO:0000126 IAO:0000115 volcano plot is a kind of scatter plot which graphs the negative log of the p-value (significance) on the y-axis versus log2 of fold-change between 2 conditions on the x-axis. It is a popular method for visualizing differential occurence of variables between 2 conditions.@en
97 INFO lowercase_definition STATO:0000127 IAO:0000115 a confidence interval which covers 99% of the sampling distribution, meaning that there is a 1% risk of false positive (type I error)@en
98 INFO lowercase_definition STATO:0000130 IAO:0000115 the Breslow-Day test is a statistical test which evaluates if the odds ratios are homogenous across N 2x2 contingency tables, for instance several 2x2 contingency tables associated with different strata of a stratified population when evaluating the relationship between exposure and outcome or associated with the different samples coming from several centres in a multicentric study in clinical trial context.@en
99 INFO lowercase_definition STATO:0000131 IAO:0000115 a sphericity test is a null hypothesis statistical testing procedure which posits a null hypothesis of equality of the variances of the differences between levels of the repeated measures factor@en
100 INFO lowercase_definition STATO:0000134 IAO:0000115 specificity is a measurement datum qualifying a binary classification test and is computed by substracting the false positive rate to the integral numeral 1@en
101 INFO lowercase_definition STATO:0000135 IAO:0000115 strictly standardized mean difference (SSMS) is a standardized mean difference which corresponds to the ratio of mean to the standard deviation of the difference between two groups. SSMD directly measures the magnitude of difference between two groups. SSMD is widely used in High Content Screen for hit selection and quality control. When the data is preprocessed using log-transformation as normally done in HTS experiments, SSMD is the mean of log fold change divided by the standard deviation of log fold change with respect to a negative reference. In other words, SSMD is the average fold change (on the log scale) penalized by the variability of fold change (on the log scale). For quality control, one index for the quality of an HTS assay is the magnitude of difference between a positive control and a negative reference in an assay plate. For hit selection, the size of effects of a compound (i.e., a small molecule or an siRNA) is represented by the magnitude of difference between the compound and a negative reference. SSMD directly measures the magnitude of difference between two groups. Therefore, SSMD can be used for both quality control and hit selection in HTS experiments.@en
102 INFO lowercase_definition STATO:0000137 IAO:0000115 an homoskedasticity test is a statistical test aiming at evaluate if the variances from several random samples are similar@en
103 INFO lowercase_definition STATO:0000138 IAO:0000115 a 2x2 contingency table is a contingency table build for 2 dichotomous variables (i.e. 2 categorical variables, each with only 2 possible outcomes). It is the simplest of contingency tables@en
104 INFO lowercase_definition STATO:0000139 IAO:0000115 a subject pairing is a planned process which executes a pairing rule and results in the creation of sets of 2 subjects meeting the pairing criteria@en
105 INFO lowercase_definition STATO:0000140 IAO:0000115 a contigency table is a data item which displays the (multivariate) frequency distribution of the possible values of categorical variables. The first row of the table corresponds to categories of one categorical variable, the first column of the table corresponds to categories of the other categorical variable, the cells corresponding to each combination of categories is filled with the observed occurences in the sample being considered. The table also contains marginal total (marginal sums) and grand total of the occurences The term contingency table was first used by Karl Pearson in "On the Theory of Contingency and Its Relation to Association and Normal Correlation", part of the Drapers' Company Research Memoirs Biometric Series I published in 1904.@en
106 INFO lowercase_definition STATO:0000141 IAO:0000115 acute toxicity study is an investigation which use interventions organized according to a factorial design and a parallel group design to observe the effect of use of high dose xenobiotics in animal models or cellular models@en
107 INFO lowercase_definition STATO:0000144 IAO:0000115 a model parameter estimate is a data item which results from a model parameter estimation process and which provides a numerical value about a model parameter.@en
108 INFO lowercase_definition STATO:0000145 IAO:0000115 the geometric distribution is a negative binomial distribution where r is 1. It is useful for modeling the runs of consecutive successes (or failures) in repeated independent trials of a system. The geometric distribution models the number of successes before one failure in an independent succession of tests where each test results in success or failure. The geometric distribution with prob = p has density p(x) = p (1-p)^x for x = 0, 1, 2, …, 0 < p ≤ 1. If an element of x is not integer, the result of dgeom is zero, with a warning. The quantile is defined as the smallest value x such that F(x) ≥ p, where F is the distribution function.@en
109 INFO lowercase_definition STATO:0000146 IAO:0000115 a null hypothesis stating that there are differences observed between group of subjects@en
110 INFO lowercase_definition STATO:0000149 IAO:0000115 binomial logistic regression model is a model which attempts to explain data distribution associated with *dichotomous* response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is logistic function.@en
111 INFO lowercase_definition STATO:0000150 IAO:0000115 a minimum value is a data item which denotes the smallest value found in a dataset or resulting from a calculation.@en
112 INFO lowercase_definition STATO:0000151 IAO:0000115 maximum value is a data item which denotes the largest value found in a dataset or resulting from a calculation.@en
113 INFO lowercase_definition STATO:0000152 IAO:0000115 a quartile is a quantile which splits data into sections accrued of 25% of data, so the first quartile delineates 25% of the data, the second quartile delineates 50% of the data and the third quartile, 75 % of the data@en
114 INFO lowercase_definition STATO:0000154 IAO:0000115 a violin plot is a plot combining the features of box plot and kernel density plot. The violin plot is therefore similar to box plot but it incorporated in the display the probability density of the data at different values. Typically violin plots will include a marker for the median of the data and a box indicating the interquartile range, as in standard box plots.@en
115 INFO lowercase_definition STATO:0000155 IAO:0000115 meta-analysis is a data transformation which uses the effect size estimates from several independent quantitative scientific studies addressing the same question in order to assess finding consistency.@en
116 INFO lowercase_definition STATO:0000156 IAO:0000115 the Scheffe test is a data transformation which evaluates all possible contrasts and adjusting the levels significance by accounting for multiple comparison. The test is therefore conservative. Confidence intervals can be constructed for the corresponding linear regression. It was developped by American statistician Henry Scheffe in 1959.@en
117 INFO lowercase_definition STATO:0000157 IAO:0000115 the LSD test is a statistical test for multiple comparisons of treatments by means of least significant difference following an ANOVA analysis
118 INFO lowercase_definition STATO:0000158 IAO:0000115 a null hypothesis which states that a linkage exists between 2 categorical variables@en
119 INFO lowercase_definition STATO:0000161 IAO:0000115 variable distribution is data item which denotes the spatial resolution of data point making up a variable. variable distribution may be compared to a known probability distribution using goodness of fit test or plotting a quantile-quantile plot for visual assessment of the fit.@en
120 INFO lowercase_definition STATO:0000162 IAO:0000115 the role played by an entity part of study group as defined by an experimental design and realized in a data analysis and data interpretation@en
121 INFO lowercase_definition STATO:0000163 IAO:0000115 trimmed mean or truncated mean is a measure of central tendency which involves the calculation of the mean after discarding given parts of a probability distribution or sample at the high and low end, and typically discarding an equal amount of both@en
122 INFO lowercase_definition STATO:0000165 IAO:0000115 a pie chart is a graph in which a circular graph is divided into sector illustrating numerical proportion, meaning that the arc length of each sector (and consequently its central angle and area), is proportional to the quantity it represents.@en
123 INFO lowercase_definition STATO:0000166 IAO:0000115 the bart chart is a graph resulting from plotting rectangular bars with lengths proportional to the values that they represent.
124 INFO lowercase_definition STATO:0000167 IAO:0000115 the first quartile is a quartile which splits the lower 25 % of the data@en
125 INFO lowercase_definition STATO:0000168 IAO:0000115 a real time quantitative pcr plot is a line graph which plots the signal fluorescence intensity as a function of the number of PCR cycle@en
126 INFO lowercase_definition STATO:0000170 IAO:0000115 the first quartile is a quartile which splits the 75 % of the data@en
127 INFO lowercase_definition STATO:0000173 IAO:0000115 homogeneity testing objective is the objective of a data transformation to test a null hypothesis that two or more sub-groups of a population share the same distribution of a single categorical variable. For example, do people of different countries have the same proportion of smokers to non-smokers@en
128 INFO lowercase_definition STATO:0000175 IAO:0000115 confidence interval calculation is a data transformation which determines a confidence interval for a given statistical parameter@en
129 INFO lowercase_definition STATO:0000176 IAO:0000115 t-statistic is a statistic computed from observations and used to produce a p-value in statistical test when compared to a Student's t distribution.@en
130 INFO lowercase_definition STATO:0000177 IAO:0000115 the beta distribution is a continuous probability distribution defined on the interval [0, 1] parametrized by two positive shape parameters, denoted by α and β, that appear as exponents of the random variable and control the shape of the distribution@en
131 INFO lowercase_definition STATO:0000180 IAO:0000115 standard normal distribution is a normal distribution with variance = 1 and mean=0@en
132 INFO lowercase_definition STATO:0000183 IAO:0000115 sphericity testing objective is a statistical objective of a data transformation which aims to test a null hypothesis of sphericity holds.@en
133 INFO lowercase_definition STATO:0000185 IAO:0000115 a 2 by n contingency table is a contingency table built for one dichotomous variable (a categorical variable with only 2 outcomes) and one polychotomous variable (a polychotomous variable with at least 2 outcomes)@en
134 INFO lowercase_definition STATO:0000188 IAO:0000115 average log signal intensity is a data item which corresponds to the sum of 2 distinct logarithm base 2 transformed signal intensities, each corresponding to a distinct condition of signal acquisition, divided by 2.@en
135 INFO lowercase_definition STATO:0000191 IAO:0000115 a goodness of fit statistical test is a statistical test which aims to evaluate if a sample distribution can be considered equivalent to a theoretical distribution used as input@en
136 INFO lowercase_definition STATO:0000192 IAO:0000115 a cartesian product is a data transformation which operates on n Sets to produce a set of all possible ordered n-tuples where each element of the tuple comes from a Set
137 INFO lowercase_definition STATO:0000193 IAO:0000115 is a population whose individual members realize (may be expressed as) a combination of inclusion rule values specifications or resulting from a sampling process (e.g. recruitment followed by randomization to group) on which a number of measurements will be carried out, which may be used as input to statistical tests and statistical inference.
138 INFO lowercase_definition STATO:0000194 IAO:0000115 self explanatory@en
139 INFO lowercase_definition STATO:0000197 IAO:0000115 a genomic coordinate system is a coordinate system to describe position of sequence on a genomic scaffold (assembly of chromosome, contig....)@en
140 INFO lowercase_definition STATO:0000198 IAO:0000115 a statistical test which makes no assumption about the underlying data distribution@en
141 INFO lowercase_definition STATO:0000199 IAO:0000115 the Mauchly's test for sphericity is a statistical test which evaluates if the variance of the differences between all combinations of the groups are equal, a property known as 'sphericity' in the context of repeated measures. It is used for instance prior to repeated measure ANOVA. The test works by assessing if a Wishart-distributed covariance matrix (or transformation thereof) is proportional to a given matrix.@en
142 INFO lowercase_definition STATO:0000200 IAO:0000115 the statistical test power is a data item which is about a statistical test and is obtained by subtracting the false negative rate (type II error rate) from 1. The power of a statistical test is the probability that it will correctly lead to the rejection of a false null hypothesis (Greene 2000). The statistical power is the ability of a test to detect an effect, if the effect actually exists (High 2000).@en
143 INFO lowercase_definition STATO:0000202 IAO:0000115 within subject comparison statistical test is a kind of statistical test which evaluates if a change occurs within one experimental unit over time following a treatment or an event@en
144 INFO lowercase_definition STATO:0000203 IAO:0000115 a cohort is a study group population where the members are human beings which meet inclusion criteria and undergo a longitudinal design@en
145 INFO lowercase_definition STATO:0000204 IAO:0000115 the F-distribution is a continuous probability distribution which arises in the testing of whether two observed samples have the same variance.@en
146 INFO lowercase_definition STATO:0000207 IAO:0000115 a planned process which establishes and states the different hypotheses to be evaluated during a null hypothesis statistical test@en
147 INFO lowercase_definition STATO:0000209 IAO:0000115 area under curve is a measurement datum which corresponds to the surface defined by the x-axis and bounded by the line graph represented in a 2 dimensional plot resulting from an integration or integrative calculus. The interpretation of this measurement datum depends on the variables plotted in the graph@en
148 INFO lowercase_definition STATO:0000210 IAO:0000115 is a data item formed by dividing the fluorescence intensity obtained in one channel to that obtained in the other channel, typically the case when considering 2-color microarray data when imaging is done for Cy3 and Cy5 dyes.@en
149 INFO lowercase_definition STATO:0000211 IAO:0000115 odds ratio homogeneity hypothesis is a null hypothesis stating that all odds ratios are homogenous, that is, remain within the same range.@en
150 INFO lowercase_definition STATO:0000212 IAO:0000115 a tetrachoric correlation coefficient is a polychoric correlation coefficient for 2 dichotomous variables used as proxy for correlation between 2 continuous latent variables.@en
151 INFO lowercase_definition STATO:0000213 IAO:0000115 discretization is a process converting a continuous variable into a polychotomous variable by concretizing a set of discretization rules@en
152 INFO lowercase_definition STATO:0000214 IAO:0000115 a confidence interval which covers 50% of the sampling distribution, meaning that there is a 50% risk of false positive (type I error)@en
153 INFO lowercase_definition STATO:0000215 IAO:0000115 probit regression model is a model which attempts to explain data distribution associated with *ordinal* response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is the ordered probit function.@en
154 INFO lowercase_definition STATO:0000216 IAO:0000115 a stratum population is a population resulting from a population stratification prior to sampling process which aims to produce homogenous subpopulations from a heterogeneous population by applying one or more stratification criteria@en
155 INFO lowercase_definition STATO:0000217 IAO:0000115 a null hypothesis which states that a given matrix is proportional to a Wishart-distributed covariance matrix@en
156 INFO lowercase_definition STATO:0000219 IAO:0000115 a real time pcr standard curve is a line graph which plots the fluorescence intensity signal as a function of the concentration of a sample used as reference and used to determine relative abundance of test samples@en
157 INFO lowercase_definition STATO:0000220 IAO:0000115 the false negative rate is a data item which denotes the proportion of missed detection of elements known to be meeting the detection criteria@en
158 INFO lowercase_definition STATO:0000221 IAO:0000115 a random variable (or aleatory variable or stochastic variable) in probability and statistics, is a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense)@en
159 INFO lowercase_definition STATO:0000222 IAO:0000115 graeco-latin square design is a study design which allows in its simpler form controlling 3 levels of nuisance variables (also known as blocking variables). The 3 nuisance factors are divided into a tabular grid with the property that each row and each column receive each treatment exactly once.@en
160 INFO lowercase_definition STATO:0000223 IAO:0000115 group assignment based on blocking variable specification is a kind of group assignment process which takes into account the levels assumed by a blocking variable to allocate subjects or experimental units to a treatment group@en
161 INFO lowercase_definition STATO:0000227 IAO:0000115 a normal distribution is a continuous probability distribution described by a probability distribution function described here: http://mathworld.wolfram.com/NormalDistribution.html@en
162 INFO lowercase_definition STATO:0000228 IAO:0000115 ordinal variable is a categorical variable where the discrete possible values are ordered or correspond to an implicit ranking@en
163 INFO lowercase_definition STATO:0000230 IAO:0000115 the expected value (or expectation, mathematical expectation, EV, mean, or the first moment) of a random variable is a data item which corresponds to the weighted average of all possible values that this random variable can take on. The weights used in computing this average correspond to the probabilities in case of a discrete random variable, or densities in case of a continuous random variable. From a rigorous theoretical standpoint, the expected value is the integral of the random variable with respect to its probability measure.@en
164 INFO lowercase_definition STATO:0000231 IAO:0000115 a confidence interval which covers 95% of the sampling distribution, meaning that there is a 5% risk of false positive (type I error). If the number of observations made is large enough, the sampling distribution can be assumed to be normal, which entails that 95% of the sampling distribution falls within roughly 2 (1.96) standard deviations from the mean.@en
165 INFO lowercase_definition STATO:0000232 IAO:0000115 number of PCR cycle is a count which enumerates how many iterations of 'annealing, renaturation, amplification,' rounds (or cycles) are performed during a polymerase chain reaction (PCR) or an assay relying on PCR.@en
166 INFO lowercase_definition STATO:0000233 IAO:0000115 sensitivity is a measurement datum qualifying a binary classification test and is computed by subtracting the false negative rate from the integer 1@en
167 INFO lowercase_definition STATO:0000234 IAO:0000115 a residual is a data item which is the output of an error estimate or model fitting process and which is an observable estimate of the unobservable error@en
168 INFO lowercase_definition STATO:0000236 IAO:0000115 the coefficient of variation is a normalized measure of dispersion of a probability distribution or frequency distribution.@en
169 INFO lowercase_definition STATO:0000238 IAO:0000115 high content screening is a kind of investigation which uses standardized cellular assays to test the effect of substances (RNAi or small molecules) held in libraries on a cellular phenotype. It relies on microscopy imaging and/or flow-cytometry, and on robotic handling, to ensure fast, high-throughput operation.@en
170 INFO lowercase_definition STATO:0000239 IAO:0000115 high throughput screening is a kind of investigation which uses standardized assays (cell based, enzymatic or chemometric) to test the effect of substances (RNAi or small molecules) held in libraries on a very specific and measurable outcome (e.g. fluorescence intensity). It relies on robotic handling to ensure fast and high-throughput assay performance, data acquisition and hit selection.@en
171 INFO lowercase_definition STATO:0000242 IAO:0000115 statistical error is a data item denoting the amount by which an observation differs from the expected value, being based on the whole statistical population from which the statistical unit was chosen randomly@en
172 INFO lowercase_definition STATO:0000243 IAO:0000115 a box plot is a graph which plots datasets relying on their quartiles and the interquartile range to create the box and the whiskers.@en
173 INFO lowercase_definition STATO:0000244 IAO:0000115 (Rn +) − (Rn −), where Rn + = (emission intensity of reporter dye)/(emission intensity of passive reference dye) in PCR with template and Rn − = (emission intensity of reporter dye)/(emission intensity of passive reference dye) in PCR without template or early cycles of a real-time reaction. Ct = threshold cycle, i.e., cycle at which a statistically significant increase in ΔRn is first detected@en
174 INFO lowercase_definition STATO:0000247 IAO:0000115 odds ratio homogeneity test is a statistical test which aims to evaluate whether the null hypothesis of consistency of the odds ratio across different strata of a population is true or not@en
175 INFO lowercase_definition STATO:0000248 IAO:0000115 a blocking variable is an independent variable which is used in a blocking process part of an experiment with the purpose of maximizing the signal coming from the main variable.
176 INFO lowercase_definition STATO:0000249 IAO:0000115 a DNA microarray hybridization is an assay relying on nucleic acid hybridization, which uses a DNA microarray device and a nucleic acid as input. It precedes a data acquisition process@en
177 INFO lowercase_definition STATO:0000250 IAO:0000115 group comparison objective is a data transformation objective which aims to determine if 2 or more study groups differ with respect to the signal of a response variable@en
178 INFO lowercase_definition STATO:0000252 IAO:0000115 a categorical variable is a variable that can only assume a finite number of values and casts observations into a small number of categories@en
179 INFO lowercase_definition STATO:0000253 IAO:0000115 the objective of a data transformation to test a null hypothesis of absence of difference within subject holds.@en
180 INFO lowercase_definition STATO:0000255 IAO:0000115 the objective of a data transformation to test a null hypothesis of absence of difference within subject holds.@en
181 INFO lowercase_definition STATO:0000256 IAO:0000115 a manhattan plot for gwas is a kind of scatter plot used to facilitate presentation of genome-wide association study (GWAS) data. Genomic coordinates are displayed along the X-axis, with the negative logarithm of the association P-value for each single nucleotide polymorphism displayed on the Y-axis.@en
182 INFO lowercase_definition STATO:0000258 IAO:0000115 a variable is a data item which can assume any of a set of values, either as determined by an agent or as randomly occurring through observation.@en
183 INFO lowercase_definition STATO:0000259 IAO:0000115 the relationship between a fraction and the number below the line (or divisor)@en
184 INFO lowercase_definition STATO:0000260 IAO:0000115 repeated measure ANOVA is a kind of ANOVA specifically developed for non-independent observations as found when repeated measurements are made on the same experimental unit. Repeated measure ANOVA is sensitive to departure from normality (evaluation using Bartlett's test), more so in the case of unbalanced groups (i.e. different sizes of sample populations). Departure from sphericity (evaluation using Mauchly's test) used to be an issue which is now handled robustly by modern tools such as R's lme4 or nlme, which accommodate dependence assumptions other than sphericity.@en
185 INFO lowercase_definition STATO:0000264 IAO:0000115 a factor level combination is one of the possible sets of factor levels resulting from the cartesian product of sets of factors and their levels as defined in a factorial design@en
186 INFO lowercase_definition STATO:0000267 IAO:0000115 grouped bar chart is a kind of bar chart which juxtaposes the discrete values for each of the possible value of a given categorical variable, thus providing within group comparison. Grouped bar charts are good for comparing between each element in the categories, and comparing elements across categories. However, the grouping can make it harder to tell the difference between the total of each group.@en
187 INFO lowercase_definition STATO:0000269 IAO:0000115 polychoric correlation coefficient is a correlation coefficient which is computed over 2 variables to characterise an association by proxy with 2 (latent) variables which are assumed to be continuous and normally distributed.@en
188 INFO lowercase_definition STATO:0000270 IAO:0000115 a full factorial design is a factorial design which ensures that all possible factor level combinations are defined and used so all between group differences can be explored@en
189 INFO lowercase_definition STATO:0000271 IAO:0000115 permutation numbering is a data transformation which counts the number of possible permutations of elements in a set of size n, each element occurring exactly once. This number is factorial n.@en
190 INFO lowercase_definition STATO:0000274 IAO:0000115 receiver operational characteristics curve is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold (aka cut-off point) is varied by plotting sensitivity vs (1 − specificity)@en
191 INFO lowercase_definition STATO:0000277 IAO:0000115 hit selection is a planned process which, in screening processes such as high-throughput screening, leads to the identification of perturbing agents which cause the typical signal generated by a standardized assay to significantly differ from the negative control. The selection itself results from meeting or exceeding a selection threshold (for instance 6 sigma from the mean or SSMD value beyond 5 when compared to positive controls or below -5 when compared to negative controls)@en
192 INFO lowercase_definition STATO:0000278 IAO:0000115 pairing rule is a rule which specifies the criteria for deciding on how to associate any 2 entities.@en
193 INFO lowercase_definition STATO:0000279 IAO:0000115 between group comparison statistical test is a statistical test which aims to detect difference between the means computed for each of the study group populations@en
194 INFO lowercase_definition STATO:0000281 IAO:0000115 a false positive rate whose value is 1 per cent@en
195 INFO lowercase_definition STATO:0000283 IAO:0000115 negative binomial probability distribution is a discrete probability distribution of the number of successes in a sequence of Bernoulli trials before a specified (non-random) number of failures (denoted r) occur. The negative binomial distribution, also known as the Pascal distribution or Pólya distribution, gives the probability of r-1 successes and x failures in x+r-1 trials, and success on the (x+r)th trial.@en
196 INFO lowercase_definition STATO:0000285 IAO:0000115 hypergeometric test is a null hypothesis test which evaluates if a random variable follows a hypergeometric distribution. It is a test of goodness of fit to that distribution. The test is suited for situation aimed at assessing cases of sampling from a finite set without replacements. For instance, testing for enrichment or depletion of elements (e.g GO categories, genes)@en
197 INFO lowercase_definition STATO:0000286 IAO:0000115 a one-tailed test is a statistical test which, assuming an unskewed probability distribution, allocates all of the significance level to evaluate only one hypothesis to explain a difference. The one-tailed test provides more power to detect an effect in one direction by not testing the effect in the other direction. one-tailed test should be preceded by two-tailed test in order to avoid missing out on detecting alternate effect explaining an observed difference.@en
198 INFO lowercase_definition STATO:0000287 IAO:0000115 a two tailed test is a statistical test which assesses the null hypothesis of absence of difference assuming a symmetric (not skewed) underlying probability distribution by allocating half of the significance level selected to each of the directions of change which could explain a difference (for example, a difference can be an excess or a loss).@en
199 INFO lowercase_definition STATO:0000289 IAO:0000115 a design matrix is an information content entity which denotes a study design. The design matrix is a n by m matrix where n, the number of rows, corresponds to the number of observations (4 rows if quadruplicates) and where m, the number of columns, corresponds to the number of independent variables. Each element in the matrix corresponds to a discretized value representing one of the factor levels for a given factor. A design matrix can be used as input to statistical modeling or statistical analysis. The design matrix contains data on the independent variables (also called explanatory variables) in statistical models which attempt to explain observed data on a response variable (often called a dependent variable) in terms of the explanatory variables. The theory relating to such models makes substantial use of matrix manipulations involving the design matrix: see for example linear regression. A notable feature of the concept of a design matrix is that it is able to represent a number of different experimental designs and statistical models, e.g., ANOVA, ANCOVA, and linear regression@en