Search Results

Found 125031 documents matching the query

Muhammad Anwar Farihin

Pengenalan Entitas Bernama pada Twit Berbahasa Indonesia Menggunakan Model Pre-Trained BERT = BERT Pre-Trained Language Model for Named Entity Recognition on Indonesian Tweets

"Named Entity Recognition (NER) has been studied extensively, particularly on English corpora. However, NER research on Indonesian-language tweet corpora remains scarce because few datasets are publicly available. BERT, one of the state-of-the-art models for NER, has not yet been applied to Indonesian-language tweets. Our contribution is a new NER dataset of 7,426 Indonesian-language tweets, together with experiments using CRF and BERT models on that dataset. The best model in this study achieves an F1 score of 72.35% in token-level evaluation, and F1 scores of 79.27% (partial match) and 75.40% (exact match) in entity-level evaluation."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Farah Ulfah Amanda

Analisis kesesuaian Pengendalian Risiko pada Tingkat Entitas PT JZZ Terhadap COSO Integrated Framework 2013 = Analysis of Compatability of Risk Control on Entity Level Control in PT JZZ Toward COSO Integrated Framework 2013

"This internship report analyzes how well the entity-level risk control activities of PT JZZ conform to the Committee of Sponsoring Organizations of the Treadway Commission (COSO) Integrated Framework 2013. The analysis re-maps the entity-level risk control activities and compares them against the framework. The results show that PT JZZ's entity-level risk control activities are fairly good but still follow the COSO Integrated Framework 1992, so improvement is needed."

Depok: Fakultas Ekonomi dan Bisnis Universitas Indonesia, 2019

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

Batini, Carlo

Conceptual database design : an entity-relationship approach

California : Benjamin/Cummings, 1992

005.74 BAT c

Buku Teks SO Universitas Indonesia Library

Alif Mahardhika

Identifikasi Ujaran Kebencian dan Ujaran Kasar pada Twit Berbahasa Campuran Indonesia-Jawa dengan Pre-Trained Language Model Berbasis BERT = Hate-Speech and Abusive Language Identification on Code-Mixed Indonesian and Javanese Language Tweets Using BERT-based Pre-trained Language Model

"Abusive language and hate speech have become common on social media. This abuse of freedom of expression risks triggering conflict and social instability, in both digital and physical social interactions. Automatic, accurate, and efficient identification of abusive language and hate speech is needed to support law enforcement by the authorities. This thesis compares the performance of BERT-based classification models in identifying abusive language and hate speech in code-mixed Indonesian-Javanese text. The comparison covers pre-trained BERT-based models of different types and architectures: BERT (base and large), RoBERTa (base), and DistilBERT (base). To address the limited ability of machines to understand code-mixed text, the study uses two scenarios that compare classification performance on code-mixed Indonesian-Javanese text and on the same text translated into Indonesian. Measured by F1-score, the best results in both scenarios come from the BERT-based model IndoBERT-large-p2, with an F1-score of 78.86% without translation and 77.22% with translation into Indonesian."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Ilma Alpha Mannix

Pencarian Dosen Pakar Menggunakan Pre-Trained Language Model BERT = Academic Expert Finding Using BERT Pre-Trained Language Model

"This study aims to test the effectiveness of the pre-trained language model BERT on the task of academic expert finding. Bidirectional Encoder Representations from Transformers (BERT) is one of the current state-of-the-art models and applies contextual word representation (contextual embedding). The dataset used in this study consists of expert data and expertise evidence. The expert data covers faculty members of the Faculty of Computer Science, Universitas Indonesia (Fasilkom UI). The expertise evidence consists of digital abstracts of final projects by Fasilkom UI students. The proposed models are three BERT variants, namely IndoBERT (Indonesian BERT), mBERT (Multilingual BERT), and SciBERT (Scientific BERT), which are compared with a baseline model using word2vec. Two approaches are used to rank expert faculty members with the BERT variants: a feature-based approach and fine-tuning. The results show that IndoBERT with the feature-based approach outperforms the baseline, with improvements ranging from 6% for the MRR metric to 9% for the NDCG@10 metric. Fine-tuning also gives IndoBERT better results than the baseline, with improvements ranging from 10% for the MRR metric to 18% for the P@5 metric. Between the two approaches, fine-tuning performs better than the feature-based approach, with improvements ranging from 1% for the P@10 metric to 5% for the MRR metric. This research shows that the pre-trained language model BERT gives better results than the word2vec baseline on the task of academic expert finding."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Rangkuti, Choirun Nisaa

Pemodelan dan perhitungan konduktivitas optis pada layered Pr0.5Ca1.5MnO4 dengan dynamical mean field theory = Modelling and calculations of optical conductivity of layered Pr0.5Ca1.5MnO4 within dynamical mean field theory

"We calculate the optical conductivity of layered (perovskite) Pr0.5Ca1.5MnO4 to identify charge-ordering phenomena. The model considers the Mn and O orbitals within the MnO2 plane of layered Pr0.5Ca1.5MnO4. Under some assumptions, the interaction terms included in the model are the inter-orbital and intra-orbital Coulomb repulsions, the static Jahn-Teller distortion, and the exchange interaction. The calculations are carried out within Dynamical Mean Field Theory to achieve self-consistency. The results show a profile close to recent experimental data, with the charge-ordering peak below 1 eV and the charge-transfer peak at 3-3.7 eV. Below the temperature T_CO/OO (~325 K), the charge-ordering peak undergoes a blue shift as the temperature decreases."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2014

S57092

UI - Skripsi Membership Universitas Indonesia Library

Putri Rizqiyah

Analisis Sentimen Vaksin COVID-19 Di Indonesia Menggunakan Model Bahasa XLMR Dan Teknik Augmentasi Data = Sentiment Analysis of COVID-19 Vaccine in Indonesia Using Pre-trained Language Model XLMR and Augmentation Data Methods

"COVID-19 vaccination is one of the long-term solutions for overcoming the COVID-19 pandemic in Indonesia. It has become a heavily discussed topic, especially on social media. Arguments for and against the vaccination program continue to emerge, so research on public sentiment toward the COVID-19 vaccination program is very useful for public communication. This study focuses on five vaccines widely used in Indonesia: AstraZeneca, Moderna, Pfizer, Sinopharm, and Sinovac. A total of 252,805 data points were collected from Twitter using the Twitter API in 2021, of which 11,361 were randomly selected for manual annotation. Classification was then performed using the XLMR language model and several baseline methods based on pre-trained language models, deep learning, machine learning, and lexicons. Data augmentation methods, namely Easy Data Augmentation (EDA), An Easier Data Augmentation (AEDA), and SeqGAN, were applied to balance the minority classes. The data were split into training and test sets using two methods, simple random sampling and stratified sampling, to assess the performance of the trained models. The results show that the proposed method, XLMR, achieves higher performance than the baseline methods, with an accuracy of 71.91% before augmentation and 72.19% after augmentation with SeqGAN under simple random sampling. Under stratified sampling, XLMR also performs best, with an accuracy of 59.96% before augmentation and 74.37% after augmentation with EDA. This research will be useful for public communication in similar cases. Future work could apply domain transfer to improve model performance."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2024

T-pdf

UI - Tesis Membership Universitas Indonesia Library

Rosalia

Model Regresi Varying Intercept = Varying Intercept Model

The varying intercept model is a regression model applied to nested data, that is, data consisting of several groups in which each group contains several individual observations. Nested data commonly show two characteristics: variance between groups, and correlation among individual observations from the same group. By considering errors at two levels, the individual level and the group level, the varying intercept model is more suitable for nested data than the linear regression model because it accommodates both characteristics. This thesis discusses the varying intercept model without a predictor variable and with one predictor variable. The model contains several parameters that must be estimated, namely the regression coefficients and the variance components. It also contains a random effect, the group effect, which is a random variable that must be predicted. The regression coefficients are estimated using Generalized Least Squares (GLS) and Maximum Likelihood (ML) via the Expectation-Maximization (EM) algorithm. The random effect is predicted using Best Linear Unbiased Prediction (BLUP), while the variance components are estimated using Maximum Likelihood (ML) via the Expectation-Maximization (EM) algorithm. A simulation is carried out to analyze the effect of the standard deviations of the error components in the varying intercept model, and the effect of the number of individual observations in each group on the standard deviations of the error components. The simulation results show that if the standard deviation of the individual-level error component is larger than that of the group-level error component, the grouping of individual observations can be ignored.
Conversely, if the standard deviation of the individual-level error component is smaller than or equal to that of the group-level error component, the grouping of individual observations cannot be ignored. The simulation results also show that the number of individual observations in each group is not associated with the standard deviation of the error components, at either the individual level or the group level.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Anton Hendranata

An econometric input-output model for Indonesia: economic impact analysis of budget development expenditure

"This study aims (1) to construct an Econometric Input-Output Model for Indonesia that emphasizes the linkages between sectors, and (2) to analyze the impact of the budget allocation for development expenditure on Indonesia's economy in 2002.
The model, constructed by combining the advantages of an input-output model and an econometric model, is called the Indonesian Econometric Input-Output Model, or Model Input-Output Ekonometrika Indonesia (MIENA). MIENA consists of 112 dynamic simultaneous equations that use secondary data from 1980-2000. The equation parameters are estimated using a combination of three estimation methods: (1) Ordinary Least Squares, (2) first-order autoregressive, and (3) second-order autoregressive. The model is validated with the Gauss-Seidel method and is then used for projections and for policy impact simulations on budget development expenditure and world economic conditions.
The study finds that the impact of budget reallocation for development expenditure (on final demand, output, income, and sectoral employment) is better than that of the budget allocation for development expenditure in the National Budgetary Plan (RAPBN) for 2002. The plantation sector contributed the most to a high output multiplier and high income. The food, beverages, and tobacco industries contributed the most to a high employment multiplier."

Economics and Finance in Indonesia, 2004

EFIN-52-3-Des2004-231

Artikel Jurnal Universitas Indonesia Library

Machffud Tra Harana Vova

Klasifikasi Dokumen dan Ekstraksi Lokasi pada Berita Bencana Alam dengan Pendekatan Neural Network dan Pre-Trained Language Model = Document Classification and Location Extraction in Natural Disaster News with Neural Network Approach and Pre-Trained Language Model

"Indonesia is a country that often experiences natural disasters. One way of dealing with natural disasters is to collect disaster news such as articles or newspapers, which is useful for increasing readability. Even so, merely collecting articles is difficult because identifying them takes time and the meaning contained in the news still needs to be absorbed. Documents therefore need to be classified to select texts relevant to natural disasters, and information is then extracted from the relevant texts. Existing research on natural disaster text classification and information extraction still uses traditional machine learning approaches and has not yet used pre-trained models for the Indonesian language. Pre-trained models and deep learning approaches often achieve better performance, so the results may be improvable. This study runs classification experiments using pre-trained word embeddings such as Word2Vec and fastText and deep learning approaches such as BERT and BiLSTM. The reproduced traditional machine learning approach with BoW performs best almost across the board, although the classifier used is an MLP, which already applies deep learning since it has several neurons. Pre-trained models such as BERT are also limited in input length. This limitation can be handled by shortening the document representation with text summarization. In this study, the summarized document representation improves classification accuracy for both the traditional machine learning and the deep learning approaches.
This study also experiments with fine-tuned pre-trained models for Indonesian-language location extraction tasks such as NER and dependency parsing, although these did not yet produce sufficiently good performance."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
