Comparison of Distance Models on K-Nearest Neighbor Algorithm in Stroke Disease Detection

Authors

  • Iswanto Iswanto Universitas Sumatera Utara
  • Tulus Tulus Universitas Sumatera Utara
  • Poltak Sihombing Universitas Sumatera Utara

DOI:

https://doi.org/10.33086/atcsj.v4i1.2097

Keywords:

Euclidean, Minkowski, Chebyshev, K-Nearest Neighbor, Manhattan

Abstract

Stroke is a cardiovascular disease (CVD) that occurs when brain cells fail to receive an adequate oxygen supply, which poses a risk of ischemic damage and can result in death. The disease can be detected based on the similarity of symptoms experienced by sufferers, so that early steps can be taken with appropriate counseling and treatment. Stroke detection of this kind calls for a machine learning method. In this research, the authors used one of the supervised learning classification methods, namely K-Nearest Neighbor (K-NN). K-NN is a classification method based on calculating the distance between the data to be classified and the training data. This research compares the Euclidean, Minkowski, Manhattan, and Chebyshev distance models to obtain optimal results. The distance models were tested using a stroke dataset sourced from the Kaggle repository. Based on the test results, the Chebyshev model has the highest average accuracy of the four distance models, at 95.49%, and reaches its highest accuracy of 96.03% at K = 10. The Euclidean and Minkowski distance models have the same accuracy at every K value, with an average accuracy of 95.45% and a highest accuracy of 95.93% at K = 10. Manhattan has the lowest average accuracy of the four distance models, at 95.42%, but reaches its highest accuracy of 96.03% at K = 6.
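
As a rough illustration of the comparison described in the abstract, the four distance models and a basic K-NN majority vote can be sketched in plain Python. This is a minimal sketch, not the authors' implementation: the function names, the toy feature vectors, and the labels are all hypothetical, and the abstract does not state which Minkowski order p was used. With p = 2 the Minkowski distance reduces to the Euclidean distance, which would be consistent with the identical accuracies reported for those two models.

```python
import math
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p = 1 is Manhattan, p = 2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance):
    """Return the majority label among the k training points closest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (NOT the Kaggle stroke dataset): two numeric features
# per record, label 1 for "stroke" and 0 for "no stroke".
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]

for name, dist in [("euclidean", euclidean),
                   ("manhattan", manhattan),
                   ("chebyshev", chebyshev),
                   ("minkowski p=2", lambda a, b: minkowski(a, b, 2))]:
    print(name, knn_predict(train_X, train_y, (0.15, 0.15), 3, dist))
```

Passing the distance function as a parameter keeps the classifier itself unchanged across the four models, so any difference in accuracy comes only from the choice of distance and of K.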

Published

2021-07-31

How to Cite

Iswanto, I., Tulus, T., & Sihombing, P. (2021). Comparison of Distance Models on K-Nearest Neighbor Algorithm in Stroke Disease Detection. Applied Technology and Computing Science Journal, 4(1), 63–68. https://doi.org/10.33086/atcsj.v4i1.2097
