| Abstract [eng] |
This paper examines the challenges of sharing cybersecurity threat data, an issue that is becoming increasingly relevant as the amount of information technology and the number of interconnected systems grow. The rising number of cyberattacks is driving organizations to exchange threat information more effectively, but this process is complicated by concerns about confidentiality, anonymity, and data quality. Organizations often avoid sharing information because of the risk of being identified, while inaccurate or unreliable information reduces the overall value of the shared data set. The objective of this work is to develop a privacy-preserving method for sharing cybersecurity threat intelligence that enables the automated conversion of data from different formats into a single, standardized format. During the study, existing threat-sharing methods, standards, and anonymization solutions were analyzed and their applicability was assessed. The analysis showed that one of the most widely used standards is STIX, which provides a structured and interoperable data representation. Based on this finding, a new method was proposed that automatically converts both unstructured and structured data into the STIX format. The method was implemented with machine learning for entity recognition and relationship identification, using a model based on the BERT architecture. Anonymization techniques are then applied to the standardized data to remove sensitive information and ensure secure sharing. Experimental results showed that data structure has a significant impact on processing quality: unstructured text data is better suited to machine learning methods, while structured data requires additional rule-based solutions. Therefore, a hybrid method is proposed that combines different processing approaches. It was also found that not all anonymization techniques can be applied while maintaining the correctness of the STIX format.
The developed method enables more efficient and secure sharing of cyber threat data, contributes to the overall strengthening of cybersecurity, and can be integrated into existing threat-sharing platforms such as MISP. The research was conducted as part of the project "Mission-driven Implementation of Science and Innovation Programmes" (No. 02-002-P-0001), funded by the Economic Revitalization and Resilience Enhancement Plan "New Generation Lithuania". |