Insurance is a well-established means of mitigating risks that could cause large financial losses for the insured. In the workplace in particular, the number of work-related accidents in Indonesia has risen from year to year, a trend that points to promising prospects for developing workers' compensation insurance. Setting an adequate premium rate, a core component of the insurance business, requires accurate prediction of claim severity. Workers' compensation claim data are tabular, and claim severity is a continuous target variable, so predicting claim severity can be framed as a regression problem on tabular data. This research examines the performance of TabTransformer, a transformer-based model that builds contextual embeddings of the categorical features and thereby also offers a degree of interpretability. The performance resulting from the context captured by TabTransformer is measured and then compared with other methods suited to regression: Decision Tree Regressor, Random Forest, XGBoost, and the Multi-Layer Perceptron, which serves as the base model of TabTransformer.