-
Six IATGs (EGFR, ANXA5, CLEC4M, CD209, UVRAG, and CACNA1C) were retrieved from the VThunter database (URL: https://db.cngb.org/VThunter/). “Orthomyxoviridae” was selected from the “Virus Family” dropdown menu and “Influenza A virus” from the “Virus” dropdown menu; the target genes of Influenza A virus were then listed in the “Target Gene” dropdown menu. VThunter is an up-to-date, accessible database created to examine and analyze the expression of viral receptors in the tissues of various animal species at the single-cell level. The database identified 107 viral receptors for 142 viral species and derived expression profiles from 285 scRNA-seq datasets, which cover 2,100,962 cells from 47 distinct animal species[23].
-
We procured human kidney renal cell carcinoma cell lines (786-O, ACHN, and Caki-1) and HK2, a proximal tubular cell line derived from normal kidney, from the American Type Culture Collection (ATCC, VA, USA). The primers were synthesized by Sangon Biotech (Shanghai, China). Cells were cultured and collected for qRT-PCR analysis as described previously[24,25]. Briefly, 50,000 cells from each of the cell lines indicated above were plated in 6-well plates. After the cells reached confluence, total RNA was isolated from the cultures using TRIzol reagent (Invitrogen, NY, USA). cDNA was synthesized using the iScript cDNA synthesis reagent (Bio-Rad, Hercules, CA, USA). β-actin was used as the internal control. The primer sequences are listed in Table 1.
-
Tumor sample data were obtained from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/)[26], and mRNA sequencing, clinical, single-nucleotide variant (SNV), copy number variant (CNV), and methylation data were obtained from the GSCA database (http://bioinfo.life.hust.edu.cn/GSCA)[27]. The details have been described previously[24,25]. Reverse phase protein array (RPPA) data retrieved from The Cancer Proteome Atlas (TCPA) database were used for pathway analysis[28]. The correlation between gene expression and drug sensitivity was based on the Genomics of Drug Sensitivity in Cancer (GDSC) database[29]. Samples from thirty-three cancer types were included in the pan-cancer analysis. The cancer types and case numbers are listed in Supplementary Table S1 (available at www.besjournal.com).
| Gene | Forward sequence | Reverse sequence |
| --- | --- | --- |
| EGFR | TTGCCGCAAAGTGTGTAACG | GTCACCCCTAAATGCCACCG |
| CACNA1C | AATCGCCTATGGACTCCTCTT | GCGCCTTCACATCAAATCCG |
| CLEC4M | GAGTAACCGCTTCTCCTGGATG | CGCACAGTCTTCATTCCCGCTA |
| ANXA5 | AACCCTCTCGGCTTTATGATGC | CGCTGGTAGTACCCTGAAGTG |
| CD209 | TCAAGCAGTATTGGAACAGAGGA | CAGGAGGCTGCGGACTTTTT |
| UVRAG | CTTGGGTCAGCAGATTCATGC | CATCGTAAGAATTGCGAACACAG |

Table 1. Primers
| Cancer type (Abbreviation) | Cases |
| --- | --- |
| Adrenocortical carcinoma (ACC) | 92 |
| Breast cancer (BRCA) | 1,218 |
| Bladder uroepithelial carcinoma (BLCA) | 411 |
| Cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) | 310 |
| Cholangiocarcinoma (CHOL) | 45 |
| Colon adenocarcinoma (COAD) | 329 |
| Lymphoid neoplasm diffuse large B-cell lymphoma (DLBC) | 48 |
| Head and neck squamous cell carcinoma (HNSC) | 566 |
| Esophageal carcinoma (ESCA) | 196 |
| Glioblastoma multiforme (GBM) | 174 |
| Kidney chromophobe (KICH) | 91 |
| Kidney renal clear cell carcinoma (KIRC) | 606 |
| Kidney renal papillary cell carcinoma (KIRP) | 323 |
| Acute myeloid leukemia (LAML) | 173 |
| Brain lower grade glioma (LGG) | 534 |
| Liver hepatocellular carcinoma (LIHC) | 359 |
| Lung adenocarcinoma (LUAD) | 576 |
| Thyroid cancer (THCA) | 572 |
| Thymoma (THYM) | 122 |
| Uterine corpus endometrial carcinoma (UCEC) | 201 |
| Uterine carcinosarcoma (UCS) | 57 |
| Uveal melanoma (UVM) | 80 |

Table S1. Cancer types and cases
-
Only fourteen cancer types (COAD, ESCA, LUSC, KIRC, HNSC, PRAD, BRCA, BLCA, THCA, STAD, KIRP, LUAD, LIHC, and KICH) were included in the mRNA expression analysis, because only these had more than ten paired tumor and normal samples. The mRNA expression values in TCGA are expressed as normalized RSEM values. Fold change was calculated as mean (tumor)/mean (normal), as described previously[24,25]. For the analysis of expression and survival, the tumor samples were divided into high- and low-expression groups at the median expression value.
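The two calculations above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the sample identifiers and expression values are hypothetical, and assigning samples that sit exactly on the median to the low group is an assumption, since the text does not say how ties were handled.

```python
from statistics import mean, median

def fold_change(tumor, normal):
    """Fold change as mean(tumor) / mean(normal) of normalized expression."""
    return mean(tumor) / mean(normal)

def median_split(expression_by_sample):
    """Split samples into high and low groups at the median expression value.

    Assumption: samples equal to the median are placed in the low group.
    """
    cutoff = median(expression_by_sample.values())
    high = sorted(s for s, v in expression_by_sample.items() if v > cutoff)
    low = sorted(s for s, v in expression_by_sample.items() if v <= cutoff)
    return high, low

# Hypothetical normalized expression values for one gene in one cancer type.
tumor_expr = {"T1": 8.0, "T2": 12.0, "T3": 4.0, "T4": 16.0}
normal_expr = {"N1": 5.0, "N2": 5.0, "N3": 4.0, "N4": 6.0}

fc = fold_change(list(tumor_expr.values()), list(normal_expr.values()))
high_group, low_group = median_split(tumor_expr)
print(fc, high_group, low_group)
```

The resulting high and low groups are the inputs to the survival comparison.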
-
SNV and clinical survival data from thirty-three cancer types were extracted from the TCGA database and merged using the unique barcode of each specimen. Tumor specimens were classified as mutant when they carried a mutation in a given gene. Survival analysis required at least two groups, each with two or more samples. The R package survival was used to fit survival time and status. Differences in survival between the wild-type and mutant groups were determined using the Cox proportional hazards model and the log-rank test. Eight mutation types were included in the analysis: deleterious mutations, missense mutations, nonsense mutations, frame-shift insertions, splice-sites, frame-shift deletions, in-frame deletions, and in-frame insertions.
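As an illustration of the group comparison, the following is a minimal pure-Python sketch of a two-group log-rank test. The study itself used the R package survival; the function name, variable names, and survival times here are hypothetical. The P-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value).

    events_* are 1 for an observed death and 0 for a censored sample.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Samples still at risk in each group just before time t.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events observed exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical survival times (months); every sample has an observed event.
mutant_times = [1, 2, 3, 4, 5]
wild_type_times = [6, 7, 8, 9, 10]
chi2_stat, p_val = logrank_test(mutant_times, [1] * 5, wild_type_times, [1] * 5)
print(chi2_stat, p_val)
```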
-
CNV data from thirty-three cancer types were collected from TCGA and analyzed using GISTIC 2.0[30]. GISTIC was used to identify significantly altered regions of amplification or deletion in the patient cohorts. This study explored the copy number level of each gene in the gene set in each cancer type based on the GISTIC score and then summarized the four types of GISTIC scores: homozygous deletion, heterozygous deletion, heterozygous amplification, and homozygous amplification. Spearman’s correlation analysis was performed by merging the mRNA expression data with raw CNV data[31]. The FDR was used to adjust the P-value. For the survival analysis, SNV data and clinical survival data were merged using specimen barcoding, and at least two groups with two or more samples were required. The R package survival was used to fit survival time and status within each group, and a log-rank test was performed to test survival differences between the groups.
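The correlation step can be illustrated with a small pure-Python Spearman coefficient, computed as the Pearson correlation of average ranks so that ties are handled. This is a sketch only; the CNV and expression values are hypothetical and the function names are not taken from any package used in the study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical raw CNV values and mRNA expression for one gene, matched by sample.
cnv = [-1.0, -0.2, 0.0, 0.4, 1.1]
expr = [2.0, 3.5, 3.4, 6.0, 9.0]
rho = spearman(cnv, expr)
print(rho)
```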
-
Methylation analysis was performed in the fourteen cancer types with more than ten paired tumor and adjacent normal tissue samples. Differences in methylation levels between tumor and normal samples were determined using Student’s t-test. Spearman’s correlation analysis was used to determine the correlation between the mRNA expression and methylation levels of the genes. For survival analysis, tumor samples were divided at the median methylation level into hypermethylated and hypomethylated groups. The FDR was also used to adjust the P-value.
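The text does not state which FDR procedure was applied. As a sketch, assuming the widely used Benjamini-Hochberg procedure, the adjustment of a set of P-values could look like the following; the input P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    Assumption: the source says only "FDR"; BH is used here for illustration.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw P-values, e.g. one per gene in a single cancer type.
raw_p = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw_p)
significant = [q <= 0.05 for q in fdr]
print(fdr, significant)
```

With these inputs only the first test remains significant at FDR ≤ 0.05, although three of the four raw P-values are below 0.05.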
-
Pathway activity was calculated for ten cancer-related signaling pathways across thirty-three cancer types: the TSC-mTOR pathway, receptor tyrosine kinases (RTKs), the Ras/Raf/MAPK pathway, the PI3K-AKT pathway, the estrogen receptor (ER) hormone pathway, the androgen receptor (AR) hormone pathway, EMT, DNA damage, cell cycle, and apoptosis[32]. Pathway activity scores and the relationship between gene expression and pathway activity (activation or repression) were determined using the median pathway scores[33].
First, samples were divided into high- and low-expression groups based on the median expression value of each gene. A t-test was used to determine the difference in pathway activity scores (PAS) between the groups, and FDR was used to adjust the P-value. When the PAS of the group with high expression of gene A was greater than the PAS of the group with low expression of gene A, gene A was considered to potentially activate the pathway; otherwise, it was considered to potentially repress the pathway[33].
-
Gene set enrichment analysis (GSEA) was performed using the R package fgsea[34]. By normalizing the enrichment score (ES), GSEA accounts for differences in the sizes of the IATG gene sets and for correlations between the gene sets and the expression dataset. The normalized enrichment score (NES) was used to compare the results across gene sets.
-
The IC50 values of 265 selected compounds with a PubChem ID and the corresponding gene expression data in the GDSC2 dataset were collected for 860 tumor cell lines from the Genomics of Drug Sensitivity in Cancer database (GDSC; URL: https://www.cancerrxgene.org/). Compounds without a valid PubChem ID were excluded from the study. These compounds are cytotoxic chemotherapeutics and targeted therapeutics acquired from commercial sources, academic researchers, and biopharmaceutical companies. The pathways targeted by these compounds included ABL signaling, apoptosis regulation, cell cycle, chromatin histone acetylation/methylation, cytoskeleton, kinases, DNA replication, EGFR signaling, ERK/MAPK signaling, genome integrity, hormone-related pathways, IGF1R signaling, immune response, JNK and p38 signaling, metabolism, mitosis, and other unclassified pathways. Pearson correlation analysis was performed to determine the relationship between the mRNA expression of each gene and the IC50 of each drug. The P-value was adjusted using FDR. A positive correlation indicates that higher gene expression is associated with a higher IC50, suggesting drug resistance.
-
The ImmuCellAI algorithm was used to estimate the infiltration of 24 immune cell types[35]. The association between immune cell infiltration and the GSVA scores of IATGs was analyzed using Spearman’s correlation, reported as a correlation coefficient, with the P-value adjusted by FDR. A set of marker genes for three immune-related pathways (chemotactic cytokines, the MHC class I antigen presentation pathway, and immunostimulators) was obtained from the TISIDB database[36]. The relationship between IATGs and the three immune-related pathways was analyzed using the GEPIA2 database (Pearson’s coefficient)[37].
-
RNA-sequencing expression profiles and corresponding clinical information for IATGs in KIRC were downloaded from the TCGA dataset, while those for the normal control group were downloaded from the GTEx database (https://gtexportal.org/home/)[38]. Differential gene expression was tested using the Wilcoxon rank sum test.
-
Unless otherwise stated, all statistical analyses were performed using GraphPad Prism (version 8.0.1) and R software (version 4.0.2). Correlation analysis was conducted using Spearman’s correlation test. Survival risk and hazard ratios (HRs) were calculated using the Cox proportional hazards model. The R package “survival” was used to examine the survival time and survival status of the two groups, and the log-rank test was used for comparative analysis. The rank sum test was used to compare data from the two groups, and a P-value < 0.05 or FDR ≤ 0.05 was considered statistically significant. Only genes and cancer types with a P-value of less than 0.05 are shown. The significance of the differences between two subgroups was evaluated using the Mann–Whitney U test (n < 5). One-way ANOVA and Bonferroni’s post hoc tests were used for multiple comparisons. Independent qRT-PCR analyses were performed in triplicate.
Multi-omics Approach Reveals Influenza-A Virus Target Genes Associated Genomic, Clinical and Immunological Characteristics in Cancers
doi: 10.3967/bes2024.094
- Received Date: 2023-10-18
Key words: Genomic changes; Immune microenvironment; Prognosis; Drug resistance; Experimental validation
The authors declare that there is no conflict of interest.
&These authors contributed equally to this work.
Citation: WANG Jiao Jiao, LIAO Yong, YANG Ping Lian, YE Wei Le, LIU Yong, XIAO Chun Xia, LIAO Wei Xiong, CHEN Chun Bo, LIU Zhi Ping, HUANG Zun Nan. Multi-omics Approach Reveals Influenza-A Virus Target Genes Associated Genomic, Clinical and Immunological Characteristics in Cancers[J]. Biomedical and Environmental Sciences. doi: 10.3967/bes2024.094