Background: Worldwide, gastric cancer remains the fifth most common cancer and the third leading cause of cancer-related death. It accounts for around 7% of global cancer incidence and approximately 9% of annual cancer-related deaths. Objective: This study aimed to analyze a gastric cancer dataset posted on Kaggle (https://www.kaggle.com/datasets/datasetengineer/gastric-cancer-gc-dataset). Methods: The dataset comprises 212354 participants, of whom 10% had gastric cancer. Data were analyzed using SPSS version 25. Descriptive statistics were used to summarize the data: frequencies and percentages for categorical variables such as sex, and the mean and standard deviation for continuous variables such as age. Relationships between study variables and gastric cancer were assessed using chi-square and one-way ANOVA tests; a p value ≤0.05 was considered significant. Results: The mean age was 53.26±18.98 years. Seventy percent of participants were male. About 30% of participants had a family history of gastric cancer, about 40% were smokers, and about 50% consumed alcohol. About 75% of participants had Helicobacter pylori infection, 80% had a high salt intake, and about 50% had chronic gastritis. Endoscopic images were reported as abnormal in approximately 30% of participants. Biopsy results were negative in 90% of participants, and CT scan reports were negative in approximately 80%. Genetic mutations were detected in TP53 (50.1%), KRAS (20%), and CDH1 (29.9%). No significant relationships were found between gastric cancer and the study variables. Conclusion: Most participants had risk factors such as Helicobacter pylori infection, high salt intake, and mutations in TP53, KRAS, and CDH1. However, the statistical analyses found no significant associations between these factors and gastric cancer.
Key words: Gastric cancer, dataset, Kaggle, KRAS, TP53, CDH1.
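
The chi-square test of independence named in the Methods can be illustrated with a minimal sketch. The study itself used SPSS, so this is not the authors' procedure. The function name `chi_square_2x2` and all counts below are hypothetical and are not taken from the dataset. The sketch handles only a 2x2 table (one degree of freedom) and applies no continuity correction.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    table: [[a, b], [c, d]] of observed counts.
    Returns (statistic, p_value) for 1 degree of freedom,
    without continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square upper-tail probability
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (NOT from the dataset):
# rows = risk factor present / absent, columns = gastric cancer yes / no.
hypothetical_table = [[400, 3600], [600, 5400]]
stat, p = chi_square_2x2(hypothetical_table)

# Significance threshold stated in the Methods (p <= 0.05).
significant = p <= 0.05
```

In this hypothetical table the proportion with gastric cancer is identical in both rows, so the observed counts equal the expected counts and the test shows no association, which is the kind of outcome the abstract reports.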