Better Software Patents

Cyber-Threat Score Generation Using Machine Learning

This patent describes a system and method for generating cyber-threat scores by leveraging machine learning (ML) and explicitly considering the quality of the sources providing the threat intelligence. The core innovation lies in training an ML model not only on the presence or absence of threat indicators and their classifications but also on “quality metrics” associated with the sources. During inference, the system identifies votes from various sources on a new indicator, assesses the quality of those sources, and generates a threat score based on the trained ML model, which has learned to weight source reliability. This approach aims to improve the accuracy and reliability of threat intelligence by mitigating the impact of low-quality or unreliable sources.

Main Ideas:

  1. Problem of Varying Source Quality: The patent acknowledges that cyber-threat intelligence originates from diverse sources with varying levels of reliability and accuracy. This heterogeneity can lead to noisy or misleading threat assessments. The invention seeks to address this by incorporating source quality into the scoring process.
  2. Machine Learning for Threat Scoring: The system utilizes machine learning models to analyze cyber-threat indicators and generate a threat score. The ML model is trained on historical data, including indicators, their classifications (e.g., malicious, benign), and quality metrics associated with the sources that reported them.
  3. Incorporating “Quality Metrics” of Sources: A key aspect of the invention is the introduction and utilization of “quality metrics” associated with each data source. These metrics are crucial for the ML model to learn the reliability and trustworthiness of different sources.
  4. Training Phase: During the training phase, the ML model learns the relationship between indicators, their true classifications, and the quality metrics of the reporting sources. This allows the model to understand which sources are more reliable for specific types of indicators.
  5. Inference Phase: When a new cyber-threat indicator is observed, the system gathers votes (classifications) from various sources. It then identifies the “learned quality metrics” associated with these sources (derived from the training phase) and uses the trained ML model to generate a threat score, weighting the votes based on the source quality.
  6. System Architecture: The patent outlines a system (Cyber-Threat Score Generator 310) comprising memory for storing sources, votes, indicators, anchor values, cyber-threat scores, quality metrics, and instructions. It includes a processor, user interface, and transceiver for communication with various sources and analyst terminals.
  7. Types of Quality Metrics: While the patent doesn’t explicitly list all possible quality metrics, it implies they could relate to the historical accuracy, reputation, consistency, or reporting frequency of a source. Each source starts with an initial quality value, and training adjusts that value toward the reliability the source actually shows in the training data.
  8. Benefits of the Invention: The described method aims to produce more accurate and reliable cyber-threat intelligence by:
    • Reducing the influence of unreliable or low-quality sources.
    • Providing a more nuanced threat score that reflects the trustworthiness of the contributing information.
    • Potentially improving the efficiency of threat analysis by prioritizing alerts based on scores derived from reliable sources.
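
To make the training and inference phases (items 4 and 5) concrete, here is a minimal sketch in Python. It is not the patent's implementation: the patent does not name a model type, so this assumes a simple logistic model in which each source's quality metric is a trainable weight on its vote. The source names (feed_a, feed_b), the vote encoding (+1 malicious, -1 benign), and the synthetic accuracy rates are all hypothetical.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(votes, quality, bias):
    """Inference: probability that an indicator is malicious.

    votes:   dict source -> +1 (malicious) or -1 (benign); absent = no vote.
    quality: dict source -> quality weight (assumed representation).
    Sources with no known quality weight contribute nothing.
    """
    z = bias + sum(quality.get(s, 0.0) * v for s, v in votes.items())
    return sigmoid(z)


def train(examples, initial_quality, epochs=200, lr=0.1, seed=0):
    """Training: adjust per-source quality weights with SGD on log loss.

    examples: list of (votes, label), label 1 = malicious, 0 = benign.
    """
    rng = random.Random(seed)
    quality = dict(initial_quality)  # start from the initial quality values
    bias = 0.0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for votes, label in data:
            err = score(votes, quality, bias) - label  # gradient of log loss
            bias -= lr * err
            for s, v in votes.items():
                quality[s] = quality.get(s, 0.0) - lr * err * v
    return quality, bias


def make_examples(n=200, seed=1):
    """Synthetic labeled data: feed_a is usually right, feed_b barely is."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        truth = 1 if label else -1
        votes = {
            "feed_a": truth if rng.random() < 0.95 else -truth,
            "feed_b": truth if rng.random() < 0.55 else -truth,
        }
        examples.append((votes, label))
    return examples


# Both sources start with the same initial quality value.
learned, bias = train(make_examples(), {"feed_a": 0.5, "feed_b": 0.5})
print("learned quality metrics:", learned)

# New indicator: the two sources disagree.
print("threat score:", score({"feed_a": 1, "feed_b": -1}, learned, bias))
```

Because both sources start from the same initial value, any gap between their learned weights comes from the training data alone. When the sources disagree on a new indicator, the score leans toward the source that proved more reliable in training.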

Key Quotes:

  • “A cyber-security analysis method uses machine learning (ML) technology to classify cyber-threat indicators, for example, as malicious or benign.” (Abstract, Page 1) – Highlights the use of ML as the core technology.
  • “Embodiments of the invention can employ as machine learning (ML) model initial quality values as parameters of the ML model and can be adjusted based on adjustments of the sources during training of the ML model.” (Page 1) – Emphasizes the integration of source quality into the ML model.
  • “Once the model is trained, the quality metrics (now called learned quality metrics) can be combined into a single probability value where a source’s vote on a classification and its vote is weighted by the learned quality metrics of the source.” (Page 1) – Describes how learned source quality influences the final threat score.
  • “The cyber-threat score can take the form of a probability that the indicator corresponds to a cyber-threat or the form of a probability that the indicator corresponds to the form of the class voted on by the set of sources.” (Page 1) – Explains the potential output format of the generated threat score.
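
The patent text quoted above does not give a formula for combining weighted votes into a single probability. One plausible reading, offered here only as an assumption, is a logistic combination in which v_s is the vote of source s (+1 malicious, -1 benign, 0 no vote), q_s is that source's learned quality metric, and b is a bias term:

```latex
P(\text{malicious} \mid v_1, \dots, v_n)
  = \sigma\!\left( b + \sum_{s=1}^{n} q_s \, v_s \right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```

Under this reading, a source whose learned quality is near zero barely moves the score, while a source with a large q_s can outweigh several weaker sources that disagree with it.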

Potential Implications:

  • Improved Threat Intelligence Accuracy: By weighting sources based on their quality, organizations can obtain more reliable threat assessments, leading to better-informed security decisions.
  • Enhanced Efficiency: Focusing on threats identified by high-quality sources can help security teams prioritize alerts and response efforts more effectively.
  • Dynamic Trust Assessment: The ML-based approach allows for a dynamic and adaptive assessment of source reliability, as the model can learn and adjust its weighting of sources over time based on their performance.

Further Considerations:

  • The patent does not explicitly detail how the initial “quality metrics” are defined or obtained. This would likely be a crucial aspect of implementing such a system.
  • The specific types of machine learning models suitable for this task are not specified, leaving room for various implementations.
  • The effectiveness of the system would heavily depend on the availability of sufficient labeled training data that includes information about source quality and the ground truth of cyber-threat indicators.

