Search Results

Found 211516 documents matching the query

Kesia Gabriele

Komparasi Metode SMOTE, SMOTE-ENN, dan SMOTE-CUT dalam Menangani Imbalanced Data pada Klasifikasi Multi-Kelas dengan Support Vector Machine (SVM) = Comparative Analysis of SMOTE, SMOTE-ENN, and SMOTE-CUT in Multi-Class SVM Classification for Imbalanced Data

"Support Vector Machine (SVM) is a popular classifier known for its high classification accuracy. However, SVM may perform suboptimally on imbalanced datasets. There are several ways to handle imbalanced data, one of which is resampling. Resampling methods are divided into two approaches: over-sampling and under-sampling. One popular over-sampling method is the Synthetic Minority Over-sampling Technique (SMOTE), which works by generating synthetic samples for the minority class. To improve model performance, SMOTE can be combined with under-sampling methods such as Edited Nearest Neighbors (ENN) or the Cluster-based Under-sampling Technique (CUT). In combination with SMOTE, ENN acts as a cleaning step that removes synthetic data generated by SMOTE that is irrelevant and considered noise. Meanwhile, CUT identifies sub-classes of the majority class to reduce the amount of over-sampling while minimizing the loss of important information in the majority class during under-sampling. Combining over-sampling and under-sampling lets the methods complement each other and offset their individual weaknesses. This research compares the performance of SMOTE and its variations, SMOTE-ENN and SMOTE-CUT, in classifying multi-class imbalanced data using SVM. The analysis concludes that SMOTE-CUT tends to yield better classification performance than SMOTE or SMOTE-ENN. Nevertheless, all of the resampling methods (SMOTE, SMOTE-ENN, and SMOTE-CUT) improved the performance of the SVM classifier."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
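
The abstract above says that SMOTE generates synthetic samples for the minority class. As a rough illustration only, the sketch below shows the usual interpolation idea in plain Python: pick a minority point, pick one of its nearest minority neighbours, and place a new point on the segment between them. The function name `smote_sample`, the parameters `k`, `n_new`, and `seed`, and the toy data are assumptions made for this example; they are not taken from the thesis, which may well have used a library implementation.

```python
import random

def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Toy two-dimensional minority class (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sample(minority)
```

Because every synthetic point lies between two existing minority points, it stays inside the region the minority class already occupies, which is also why a cleaning step such as ENN may be applied afterwards.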

Valery Ongso Putri

Analisis Perbandingan Metode SMOTE, SMOTE-ENN, dan SMOTE-Tomek Link Dalam Menangani Imbalanced Data pada Klasifikasi = Comparative Analysis of SMOTE, SMOTE-ENN, and SMOTE-Tomek Link Methods in Handling Imbalanced Data in Classification

"Data imbalance is a common problem in data analysis. Data are imbalanced when the number of samples differs across classes. This imbalance biases a classification model: the model tends to predict the majority class more effectively than the minority class, which can lead to misinterpretation in decision-making. There are several ways to handle imbalanced data, namely random undersampling and random oversampling. One popular random oversampling method is the Synthetic Minority Over-sampling Technique (SMOTE). SMOTE can be combined with random undersampling methods, namely Edited Nearest Neighbors (ENN) and Tomek link. In the combined SMOTE-ENN and SMOTE-Tomek link methods, SMOTE runs first, creating synthetic samples in the minority class. ENN and Tomek link then act as a cleaning step, removing data that are irrelevant and considered noise. To examine the effect of the three resampling methods (SMOTE, SMOTE-ENN, and SMOTE-Tomek Link), a data simulation was conducted. The simulation shows the effect of sample size, class proportion, and resampling method on decision tree, random forest, and XGBoost classification models on imbalanced data. The simulation was run for 100 iterations, which showed that the first iteration is sufficient to represent the results of all 100. The results show that all three methods tend to give good results, with increases in precision, recall, ROC-AUC, and G-Mean. SMOTE with XGBoost works well on small sample sizes, with a fairly significant increase in metric values. With SMOTE-ENN, recall tends to increase while precision decreases at proportions of 1:9, 2:8, and 3:7 with relatively small samples. SMOTE-Tomek Link also increases the metric values on relatively small samples with proportions of 1:9 and 2:8.
In addition, the resampling methods were applied to data available on Kaggle.com, namely Pima Indian Diabetes and Give Me Some Credit:: 2011 Competition. On the Pima Indian Diabetes data, the recall, ROC-AUC, and G-Mean values are highest using SMOTE-ENN with the XGBoost model. On the Give Me Some Credit:: 2011 Competition data, the SMOTE-ENN method with the XGBoost model likewise gives the highest metric values."

Depok: Fakultas Matematika Dan Ilmu Pengetahuan Alam Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
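
The abstract above reports G-Mean alongside precision and recall. The sketch below assumes the common binary definition, the geometric mean of sensitivity and specificity; the abstract does not state which definition the thesis used. The helper name `g_mean` and the toy labels are hypothetical.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels
    (assumed definition; hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # recall on the positive (minority) class
    specificity = tn / (tn + fp)  # recall on the negative (majority) class
    return math.sqrt(sensitivity * specificity)

# Toy data with a 2:8 class proportion (made-up values).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
score = g_mean(y_true, y_pred)
```

Under this definition a model that ignores the minority class scores 0 no matter how high its plain accuracy is, which is one reason the metric is often reported for imbalanced data.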

Jeffri Ivander

Kombinasi algoritma support vector machine (SVM) dan analisis multi-attribute ABC pada klasifikasi inventori indirect material di perusahaan otomotif = Combination of support vector machine (SVM) and multi-attribute ABC analysis on indirect material inventory classification in the automotive industry / Jeffri Ivander

"ABSTRACT
Indirect material classification at the automotive company where this research was conducted has not been performed well, so multi-attribute ABC analysis and support vector machine are expected to improve classification performance. Multi-attribute ABC is used to classify items based on classification criteria whose weights are calculated using the analytic hierarchy process, and the support vector machine is used to find the pattern relating the criteria to the classification results and to assess classification performance. The results of this research show that price and criticality are the most influential criteria for the classification results, and that classification performance improved after applying these methods."

2014

S56119

UI - Skripsi Membership Universitas Indonesia Library

Theresia Veronika Rampisela

Klasifikasi data skizofrenia dengan support vector machines dan twin support vector machines = Classification of schizophrenia data using support vector machines and twin support vector machines

"Schizophrenia is a severe and chronic mental disorder marked by disturbances in thought, perception, and behaviour. Because these disturbances can lead people with schizophrenia to commit or attempt suicide, they have a lower life expectancy than the general population. Schizophrenia is also difficult to diagnose, as there is not yet a physical test for it and its symptoms closely resemble those of several other mental disorders. Using the Northwestern University Schizophrenia Data, this research aims to classify people who have schizophrenia and people who do not. The data consist of 392 observations and 65 variables, comprising demographic data and clinician-completed Scale for the Assessment of Positive Symptoms and Scale for the Assessment of Negative Symptoms questionnaires. The classification methods used are the machine learning methods Support Vector Machines (SVM) and Twin Support Vector Machines (Twin SVM), implemented in MATLAB R2017a. Simulations were run with different data and different percentages of training and testing data. In each simulation, accuracy and running time were measured. Validation and performance evaluation of the optimized models were done by averaging ten runs of hold-out validation. In general, Twin SVM classified the schizophrenia data more accurately than SVM. Twin SVM with a Gaussian kernel produced the best final accuracy, 91.0%. In terms of final running time, SVM with a Gaussian kernel was the fastest, at 0.664 seconds. Furthermore, SVM with a linear kernel, SVM with a Gaussian kernel, and Twin SVM each reached an accuracy of up to 95.0% in at least one simulation."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2018

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Woro Sudaryanti

Sistem identifikasi pembicara berbahasa Indonesia menggunakan support vector machine (SVM) = Indonesian-language speaker identification system using support vector machine (SVM)

"This research studies a speaker identification system for Indonesian speech based on SVM. The system parameters are silence removal, PCA, and the mean and variance of MFCC. The experiments use 26 hours (715 speakers) of Indonesian broadcast news from television and radio, segmented into 5-, 10-, and 15-second segments.

The results show a speaker identification accuracy of 94-98% for the combination of silence removal and mean MFCC, with the best accuracy at the 10-second segment length. However, accuracy tends to fall as the number of speakers increases. This study could be extended to a speech retrieval system based on who is speaking in a session."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2009

T-Pdf

UI - Tesis Open Universitas Indonesia Library

Dilla Fadlillah Salma

Analisis akurasi metode support vector machine, random forest, dan logistic regression dalam mengklasifikasi data asuransi mobil dengan implementasi metode seleksi fitur one dimensional naive bayes classifier = Accuracy analysis of support vector machine, random forest, and logistic regression method in classifying car insurance data with one dimensional naive bayes classifier features selection implementation

"Owning and using a car carries various risks, such as accidents. To reduce the burden of these risks, companies sell car insurance products. Car insurance is a vehicle insurance product that aims to protect car owners from financial losses involving their insured vehicles. To offer insurance products, some companies sell by cold calling. This sales technique is more effective if prospective customers are first predicted or classified into a buying or not-buying class.
In this thesis, classification is carried out using Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR), with the One Dimensional Naïve Bayes Classifier (1-DBC) as the feature selection method. The dataset contains 4000 records with a total of 18 features. The results show that SVM achieves higher accuracy than the other two methods. In addition, the feature selection method increased the accuracy of Random Forest and Logistic Regression. With 1-DBC, all three classification methods achieved their highest accuracy when using 15 features."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2018

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Muhammad Nur Ichsan

Analisis Kinerja Model Support Vector Machine dalam Mengklasifikasi Tingkat Keparahan Penyakit Pestalotiopsis sp. pada Data Citra Daun Karet Menggunakan Fitur Warna dan Jumlah Bintik = Performance Analysis of Support Vector Machine Model in Classifying the Severity of Pestalotiopsis sp. Disease on Rubber Leaf Image Data Using Color and Number of Spots Features

"Indonesia currently ranks as the second-largest rubber producer in the world, contributing about 29.8% of global demand. However, rubber production in Indonesia has declined from year to year; one factor is leaf fall disease caused by the fungus Pestalotiopsis sp. In 2021, the area of rubber plantations affected by the disease reached 30,328.84 hectares, and infected plants suffered a decrease in latex production of up to 30%. The disease attacks the leaves, forming spots measuring 0.5-2 cm that cause necrosis and leaf fall. Classifying the severity of Pestalotiopsis sp. disease morphologically, by observing the number of spots and the color of rubber leaves, requires a great deal of time and labor, especially given the large area of infected plantations. Therefore, machine learning methods are proposed to reduce the time and effort required to classify leaf fall disease caused by Pestalotiopsis sp. In this study, a machine learning model is used to classify five classes of Pestalotiopsis sp. disease severity: level 0 (healthy), level 1 (mildly infected), level 2 (moderately infected), level 3 (severely infected), and level 4 (very severely infected). The dataset consists of images of rubber plant leaves obtained from the Sembawa Rubber Research Center. The model takes rubber leaf images as input, and each image is segmented using k-means clustering. From the segmented data, hue, saturation, and value (HSV) color features are extracted, along with a number-of-spots feature obtained by contour detection using Suzuki's contour algorithm. These features are then classified using a one-vs-rest multiclass Support Vector Machine (SVM), with Grid Search Cross Validation with 5 folds used to find the best SVM hyperparameters. Performance is evaluated using accuracy and Cohen's kappa statistic.
The best hyperparameters are the radial basis function kernel with C=100. Based on 5 experimental runs, the model with the highest accuracy is the one that uses both the color and number-of-spots features, with an average accuracy of 81.86% and an average Cohen's kappa statistic of 0.77, which means the model classifies rubber leaf image data fairly well."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership  Universitas Indonesia Library
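
The abstract above evaluates the classifier with Cohen's kappa statistic in addition to accuracy. As a minimal sketch of the standard formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the label frequencies, the following plain-Python helper computes it for five severity levels. The function name `cohens_kappa` and the toy labels are assumptions for illustration and do not come from the thesis data.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Cohen's kappa from two equal-length label sequences (hypothetical helper)."""
    n = len(y_true)
    # Observed agreement: fraction of items where both labels match.
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of the product of marginal frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy labels for severity levels 0 to 4 (made-up values).
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 0, 1, 2, 2, 2, 3, 3, 4, 3]
kappa = cohens_kappa(y_true, y_pred)
```

In this toy case the accuracy is 0.80 while kappa is lower, because kappa discounts the agreement that would be expected by chance.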

Rafiqatul Khairi

Klasifikasi Kanker Pankreas menggunakan Kernel-based Support Vector Machine = Pancreatic Cancer Classification using Kernel-based Support Vector Machine

"Pancreatic cancer is a disease in which malignant (cancerous) tumor cells develop in the tissue of the pancreas, an organ behind the lower part of the stomach and in front of the spine that helps the body use and store energy from food by producing hormones to control blood sugar levels and digestive enzymes to break down food. Pancreatic cancer is rarely detected at an early stage. One sign of pancreatic cancer is diabetes, especially when it coincides with rapid weight loss, jaundice, or pain in the upper abdomen that spreads to the back. Among the various types of cancer, pancreatic cancer has the lowest survival rate: only about 3-6% of those diagnosed survive for five years. If patients are diagnosed in time for treatment, their chances of survival increase. A tumor marker commonly used to follow the course of pancreatic cancer is CA 19-9, which can be measured in the blood. Healthy people can have small amounts of CA 19-9 in their blood. High levels of CA 19-9 are often a sign of pancreatic cancer, but they can sometimes indicate other types of cancer or certain noncancerous disorders, such as cirrhosis and gallstones. Because a high level of CA 19-9 is not specific to pancreatic cancer, CA 19-9 cannot be used by itself for screening or diagnosis. It can, however, help monitor the progression of the cancer and the effectiveness of treatment. In this study, the Kernel-based Support Vector Machine method is used to classify CA 19-9 blood test results into two classes: patients diagnosed with pancreatic cancer and normal patients (not diagnosed with pancreatic cancer). The method achieved an accuracy of about 95%."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-pdf

UI - Skripsi Membership  Universitas Indonesia Library

Febrisa Dhewi Ramadhany

Klasifikasi thalassemia menggunakan support vector machines (SVM) dan multi-layer perceptron (MLP) = Classification of thalassaemia using support vector machines (SVM) and multi-layer perceptron (MLP)

"ABSTRACT
Thalassaemia is a red blood cell disorder inherited from one's parents. It damages the proteins in red blood cells so that they cannot function properly. Thalassaemia cannot yet be cured, but it can be prevented through early detection or prenatal testing, known as screening. In this study, early detection is performed with the help of a computer. Several techniques have been used to classify thalassaemia screening data, among them Support Vector Machines (SVM) and Multi-Layer Perceptron (MLP). The thalassaemia data used were obtained from Harapan Kita Hospital (RSAB Harapan Kita), Indonesia, and have 10 features. In testing, classification using SVM showed the better accuracy, 97.47190988%, with an average running time of 0.145899875 seconds, while MLP obtained a best accuracy of 63.91% with an average running time of 0.009033 seconds. The results show that classification using SVM is more accurate than classification using MLP."

2018

S-Pdf

UI - Skripsi Membership  Universitas Indonesia Library

Rinawati

Studi eksperimental learning to rank dengan menggunakan metode Ranking SVM = Experimental study of learning to rank using the Ranking SVM method

"The rapid growth in the number of web pages has motivated many parties to build search engines with optimal performance. Ranking is an important part of a search engine's workflow. One machine learning method that has attracted considerable attention from researchers is Ranking SVM. Ranking SVM learns a linear model with the aim of obtaining a ranking function based on the basic idea of SVM (Support Vector Machines). This experimental study aims to measure the performance of Ranking SVM on the LETOR data. LETOR is a benchmark dataset organized by Microsoft and released to facilitate research on learning to rank. The experimental results show that the MAP (Mean Average Precision) of Ranking SVM on LETOR is 47.38%. This shows that ranking remains a challenging problem and that further research is needed to achieve higher accuracy."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2012

T31855

UI - Tesis Open  Universitas Indonesia Library
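
The abstract above reports Mean Average Precision (MAP) on LETOR. The sketch below shows one common way to compute MAP from binary relevance judgments listed in ranked order; it assumes every relevant document for a query appears somewhere in its ranked list. The function names and toy rankings are hypothetical and are not drawn from LETOR or from the thesis.

```python
def average_precision(relevance):
    """Average precision for one query; relevance is a 0/1 list in ranked order."""
    hits = 0
    precisions = []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision measured at the rank of each relevant document.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in queries) / len(queries)

# Two toy queries with relevance flags in ranked order (made-up values).
queries = [[1, 0, 1, 0], [0, 1, 0, 0]]
map_score = mean_average_precision(queries)
```

Here the first query has an average precision of (1/1 + 2/3) / 2 and the second of 1/2, so the mean over both queries is 2/3.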
