IPA-Best Practices for Expression Data Analysis

If you have a large dataset to analyze, involving several hundred to tens of thousands of molecules, you will want to run an analysis appropriate to your interests: Core Analysis, IPA-Tox Analysis, or IPA-Metabolomics (see Types of Analyses).  Depending on the analysis type you choose, the results show how your dataset overlaps with molecules that are associated with various diseases and cellular functions and that are part of canonical pathways.  These results often give a good indication of which cellular processes your dataset is related to and can lead to further investigation of these relationships.  In addition, you can view molecular networks that show how the significant molecules in your dataset are known to interact with one another and with other closely interacting molecules.

 

With this in mind, we suggest the following best practices for expression data analysis, so that you get the most meaningful results out of IPA.

 

1. Make sure your dataset is formatted properly.

  • One column should have the identifiers you plan to use in your IPA analysis.

  • There should be only one row for the header. (You might need to delete extra rows from a raw dataset file.)
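The two formatting rules above can be checked before upload with a short script. The following is a minimal sketch in Python, not part of IPA: the function name check_dataset_format, the tab delimiter, and the identifier column name "ID" are all assumptions made for illustration.

```python
import csv
import io

def check_dataset_format(text, id_column, delimiter="\t"):
    """Return a list of formatting problems found in a delimited dataset.

    Hypothetical helper: the column name and delimiter are assumptions,
    not anything IPA prescribes.
    """
    # Ignore completely blank rows; row numbers below count non-blank rows.
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter)
            if any(cell.strip() for cell in r)]
    if not rows:
        return ["file is empty"]

    header = [cell.strip() for cell in rows[0]]
    if id_column not in header:
        # Typical cause: extra title or comment rows above the real header.
        return [f"no identifier column named {id_column!r} in the first row"]

    problems = []
    id_index = header.index(id_column)
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {number}: expected {len(header)} columns, found {len(row)}")
        elif not row[id_index].strip():
            problems.append(f"row {number}: missing identifier")
    return problems
```

An empty list means that the first row is the only header row and that every data row carries an identifier. A raw export with title rows above the header is reported as a missing identifier column, which is the cue to delete those extra rows.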

 

2. If you have replicate samples, use MS Excel or a statistical package to calculate the averages and p-values before uploading your data into IPA.
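As one possible way to do this outside Excel, the sketch below averages the replicate log ratios for a single molecule and computes a two-sided p-value from a one-sample t-test against zero. The choice of test is an assumption; the text above does not say which test to use, and a two-group design would need a different one. The function names are hypothetical, and the p-value is obtained by numerically integrating the Student's t density so that only the Python standard library is needed.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus twice the area under the density on [0, |t|].

    The area is approximated with Simpson's rule (steps must be even).
    """
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)

def summarize_replicates(log_ratios):
    """Return (mean log ratio, p-value) for one molecule.

    Assumption: one-sample t-test with the null hypothesis that the
    mean log ratio is 0, i.e. no change in expression.
    """
    n = len(log_ratios)
    if n < 2:
        raise ValueError("need at least two replicates")
    avg = mean(log_ratios)
    sd = stdev(log_ratios)
    if sd == 0:
        # Identical replicates: the t statistic is undefined.
        return avg, (1.0 if avg == 0 else 0.0)
    t_stat = avg / (sd / math.sqrt(n))
    return avg, two_sided_p(t_stat, n - 1)
```

Calling summarize_replicates once per molecule yields the two columns (average and p-value) to place next to the identifier column in the upload file.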

 

3. When creating the analysis, set a cutoff value for each expression value type used. For large datasets, we recommend using both a p-value and another expression value type.

 

4. Check the number of "Molecules Eligible for Network Generation". For the most meaningful results from your IPA analysis, you should have fewer than approximately 800.  If you have too many molecules eligible for analysis, this is equivalent to asking the question, "What does the genome do?" Instead, you should narrow your focus to those molecules that change, are statistically significant, or follow some sort of pattern.  If you have more than 800 molecules eligible for network generation, try increasing the stringency of your cutoff values or adding some filters. Use the Recalculate button to refresh the screen.
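To get a rough idea of suitable cutoffs before opening IPA, the combination of a p-value cutoff and a second expression value cutoff from step 3 can be tried offline. The sketch below is illustrative only: the field names p_value and log_ratio, the default thresholds, and the step size are assumptions, and the count it produces is not the same as IPA's "Molecules Eligible for Network Generation", which IPA computes itself after mapping identifiers.

```python
def passes_cutoffs(row, max_p, min_abs_log_ratio):
    """True if a row meets both the p-value and the log-ratio cutoff."""
    return (row["p_value"] <= max_p
            and abs(row["log_ratio"]) >= min_abs_log_ratio)

def count_passing(rows, max_p, min_abs_log_ratio):
    """Number of rows that pass both cutoffs."""
    return sum(1 for r in rows if passes_cutoffs(r, max_p, min_abs_log_ratio))

def tighten_cutoff(rows, target=800, max_p=0.05,
                   min_abs_log_ratio=1.0, step=0.25):
    """Raise the log-ratio cutoff until fewer than `target` rows pass.

    Returns (final cutoff, number of rows passing). The defaults are
    placeholder values, not recommendations from IPA.
    """
    if target < 1 or step <= 0:
        raise ValueError("target must be >= 1 and step must be > 0")
    cutoff = min_abs_log_ratio
    n = count_passing(rows, max_p, cutoff)
    while n >= target:
        cutoff += step
        n = count_passing(rows, max_p, cutoff)
    return cutoff, n
```

The cutoff found this way is only a starting point: enter it in IPA, use Recalculate, and adjust according to the count that IPA reports.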