Add: sigma rules (#175)

2021-11-22 08:45:44 +09:00
parent b53342218c
commit 034f9c0957
1086 changed files with 40715 additions and 192 deletions
--- a/tools/sigmac/README-English.md
+++ b/tools/sigmac/README-English.md
@@ -13,8 +13,11 @@ Please refer to this documentation to convert rules on your own for local testin

 ## Python requirements

-You need Python 3.8+ and the following modules: `pyyaml`, `ruamel_yaml`, `requests`. 
-You can install the modules with `pip3 install -r requirements.txt`.
+You need Python 3.8+ and the following modules: `pyyaml`, `ruamel.yaml`, `requests`. 
+
+```sh
+pip3 install -r requirements.txt
+```

 ## About Sigma

@@ -36,33 +39,25 @@ Create an environmental variable `$sigma_path` that points to the Sigma reposito
 ```sh
 export sigma_path=/path/to/sigma_repository
 cp hayabusa.py $sigma_path/tools/sigma/backends
+cp convert.sh $sigma_path
+cp splitter.py $sigma_path
 ```

 * Caution：Be sure to specify the path to your Sigma repository in place of `/path/to/sigma_repository`.

-### Converting a single rule
+### Converting rules

-You can convert a single rule with the following syntax:
+You can convert the rules by running `convert.sh`.
+The converted rules are written to the `hayabusa_rules` folder.

 ```sh
-python3 $sigma_path/tools/sigmac <Target Rule> --config <Config File Name> --target hayabusa
+export sigma_path=/path/to/sigma_repository
+cd $sigma_path
+sh convert.sh
 ```

-Example:
-```sh
-python3 $sigma_path/tools/sigmac $sigma_path/rules/windows/create_remote_thread/sysmon_cactustorch.yml --config $sigma_path/tools/config/generic/sysmon.yml --target hayabusa > sysmon_cactustorch.yml
-```
-
-### Converting multiple rules
-
-This example will convert all Sigma rules for Windows event logs to hayabusa rules and save them to the current directory.
-Please run this command from the `./rules/Sigma` directory.
-
-```sh
-find $sigma_path/rules/windows/ -type f -name '*.yml' -exec sh -c 'python3 $sigma_path/tools/sigmac {} --config $sigma_path/tools/config/generic/sysmon.yml --target hayabusa > "$(basename {})"' \;
-```
-
-※  It takes around 30 minutes to convert all rules.
+`sigmac`, which we use to convert the rule files, has many options.
+If you want to use any of them, edit `convert.sh`.

 ## Currently unsupported rules

@@ -74,21 +69,6 @@ sigma/rules/windows/image_load/sysmon_mimikatz_inmemory_detection.yml
 sigma/rules/windows/process_creation/process_creation_apt_turla_commands_medium.yml
 ```

-Also, the following rules cannot be automatically converted：
-```
-process_creation_apt_turla_commands_medium.yml
-sysmon_mimikatz_inmemory_detection.yml
-win_susp_failed_logons_explicit_credentials.yml
-win_susp_failed_logons_single_process.yml
-win_susp_failed_logons_single_source_kerberos.yml
-win_susp_failed_logons_single_source_kerberos2.yml
-win_susp_failed_logons_single_source_kerberos3.yml
-win_susp_failed_logons_single_source_ntlm.yml
-win_susp_failed_logons_single_source_ntlm2.yml
-win_susp_failed_remote_logons_single_source.yml
-win_susp_samr_pwset.yml
-```
-
 ## Sigma rule parsing errors

-Some rules will have been able to be converted but will cause parsing errors. We will continue to fix these bugs but for the meantime the majority of Sigma rules do work so please ignore the errors for now.
+Some rules will have been able to be converted but will cause parsing errors. We will continue to fix these bugs but for the meantime the majority of Sigma rules do work so please ignore the errors for now.
--- a/tools/sigmac/README-Japanese.md
+++ b/tools/sigmac/README-Japanese.md
@@ -14,8 +14,12 @@ Sigmaからhayabusa形式に変換されたルールが`./rules/Sigma`ディレ

 ## Pythonの環境依存

-Python 3.8以上と次のモジュールが必要です：`pyyaml`、`ruamel_yaml`、`requests` 
-`pip3 install -r requirements.txt`というコマンドでインストールできます。
+Python 3.8以上と次のモジュールが必要です：`pyyaml`、`ruamel.yaml`、`requests` 
+以下のコマンドでインストール可能です。
+
+```sh
+pip3 install -r requirements.txt
+```

 ## Sigmaについて

@@ -37,32 +41,22 @@ Sigmaレポジトリのパスが書いてある`$sigma_path`という環境変
 ```sh
 export sigma_path=/path/to/sigma_repository
 cp hayabusa.py $sigma_path/tools/sigma/backends
+cp convert.sh $sigma_path
+cp splitter.py $sigma_path
 ```

 * 注意：`/path/to/sigma_repository`そのままではなくて、自分のSigmaレポジトリのパスを指定してください。

-### 単体のルールを変換
-
-以下のシンタクスで単体のルールを変換できます：
+### ルールの変換
+convert.shを実行することでルールの変換が実行されます。変換されたルールはhayabusa_rulesフォルダに作成されます。

 ```sh
-python3 $sigma_path/tools/sigmac <変換対象ruleの指定> --config <configの指定> --target hayabusa
+export sigma_path=/path/to/sigma_repository
+cd $sigma_path
+sh convert.sh
 ```

-例：
-```sh
-python3 $sigma_path/tools/sigmac $sigma_path/rules/windows/create_remote_thread/sysmon_cactustorch.yml --config $sigma_path/tools/config/generic/sysmon.yml --target hayabusa > sysmon_cactustorch.yml
-```
-
-### 複数のルールを変換
-
-以下のように、SigmaのすべてのWindowsイベントログルールをhayabusaルールに変換して、カレントディレクトリに保存します。`./rules/Sigma`ディレクトリから実行して下さい。
-
-```sh
-find $sigma_path/rules/windows/ -type f -name '*.yml' -exec sh -c 'python3 $sigma_path/tools/sigmac {} --config $sigma_path/tools/config/generic/sysmon.yml --target hayabusa > "$(basename {})"' \;
-```
-
-※ すべてのルールを変換するのに、約30分かかります。
+ルールの変換に利用しているsigmacには様々なオプションが用意されています。オプションを変更する場合はconvert.shを編集してください。

 ## 現在サポートされていないルール

@@ -74,21 +68,6 @@ sigma/rules/windows/image_load/sysmon_mimikatz_inmemory_detection.yml
 sigma/rules/windows/process_creation/process_creation_apt_turla_commands_medium.yml
 ```

-また、以下のルールも現在変換できません：
-```
-process_creation_apt_turla_commands_medium.yml
-sysmon_mimikatz_inmemory_detection.yml
-win_susp_failed_logons_explicit_credentials.yml
-win_susp_failed_logons_single_process.yml
-win_susp_failed_logons_single_source_kerberos.yml
-win_susp_failed_logons_single_source_kerberos2.yml
-win_susp_failed_logons_single_source_kerberos3.yml
-win_susp_failed_logons_single_source_ntlm.yml
-win_susp_failed_logons_single_source_ntlm2.yml
-win_susp_failed_remote_logons_single_source.yml
-win_susp_samr_pwset.yml
-```
-
 ## Sigmaルールのパースエラーについて

 一部のルールは変換できたものの、パースエラーが発生しています。
--- a/tools/sigmac/README-hayabusa-sigma-backend.md
+++ b/tools/sigmac/README-hayabusa-sigma-backend.md
--- a/tools/sigmac/convert.sh
+++ b/tools/sigmac/convert.sh
@@ -0,0 +1,3 @@
+rm -rf hayabusa_rules
+python ./tools/sigmac -t hayabusa --config ./tools/config/generic/sysmon.yml --defer-abort -r rules/windows/ > sigma_to_hayabusa.yml
+python splitter.py
--- a/tools/sigmac/hayabusa.py
+++ b/tools/sigmac/hayabusa.py
@@ -1,20 +1,22 @@
 import copy
 from collections import OrderedDict
 from io import StringIO
-
 import yaml
+import re
+
 from sigma.backends.base import SingleTextQueryBackend
-from sigma.parser.condition import SigmaAggregationParser
+from sigma.parser.condition import SigmaAggregationParser, ConditionOR, ConditionAND
 from sigma.parser.modifiers.base import SigmaTypeModifier
 from sigma.parser.modifiers.type import SigmaRegularExpressionModifier

+SPECIAL_REGEX = re.compile(r"^\{(\d)+,?(\d+)?\}")
+
 class HayabusaBackend(SingleTextQueryBackend):
    """Base class for backends that generate one text-based expression from a Sigma rule"""
    ## see tools.py
    ## use this value when sigmac parse argument of "-t"
    identifier = "hayabusa"
    active = True
-
    # the following class variables define the generation and behavior of queries from a parse tree some are prefilled with default values that are quite usual
    andToken = " and "                  # Token used for linking expressions with logical AND
    orToken = " or "                    # Same for OR
@@ -22,31 +24,33 @@ class HayabusaBackend(SingleTextQueryBackend):
    subExpression = "(%s)"              # Syntax for subexpressions, usually parenthesis around it. %s is inner expression
    valueExpression = "%s"              # Expression of values, %s represents value
    typedValueExpression = dict()       # Expression of typed values generated by type modifiers. modifier identifier -> expression dict, %s represents value
-
    sort_condition_lists = False
    mapListsSpecialHandling = True
-
    name_idx = 1
    selection_prefix = "SELECTION_{0}"
    name_2_selection = OrderedDict()
-
+    
    def __init__(self, sigmaconfig, options):
        super().__init__(sigmaconfig)
-
+        self.re_init()
+        
+    def re_init(self):
+        self.name_idx = 1
+        self.name_2_selection = OrderedDict()
+    
    def cleanValue(self, val):
        return val
-
+    
    def generateListNode(self, node):
        return self.generateORNode(node)
-
+    
    def create_new_selection(self):
        name = self.selection_prefix.format(self.name_idx)
        self.name_idx+=1
        return name
-
+    
    def generateMapItemNode(self, node):
        fieldname, value = node
-
        transformed_fieldname = self.fieldNameMapping(fieldname, value)
        if self.mapListsSpecialHandling == False and type(value) in (str, int, list) or self.mapListsSpecialHandling == True and type(value) in (str, int):
            name = self.create_new_selection()
@@ -60,12 +64,54 @@ class HayabusaBackend(SingleTextQueryBackend):
            return self.generateNode((transformed_fieldname+"|re","^$")) #nullは正規表現で表す。これでいいのかちょっと不安
        else:
            raise TypeError("Backend does not support map values of type " + str(type(value)))
-
+        
    def generateMapItemTypedNode(self, fieldname, value):
        # `|re`オプションに対応
        if type(value) == SigmaRegularExpressionModifier:
            fieldname = fieldname + "|re"
-            return self.generateNode((fieldname,value.value))
+
+            # Regex engines such as Python's accept an escaped / (slash) or " (double quote), but the Rust regex engine raises an error when they are escaped.
+            # So remove the escapes from slashes and double quotes here.
+            # This implementation is fairly fragile; a future version should probably drop it and have hayabusa use a regex engine that behaves like the one in Python.
+            regex_value = value.value.replace('\\/','/')
+            regex_value = regex_value.replace("\\\"","\"")
+            
+            ## Additionally, Python-style regex engines do not require { to be escaped but Rust does, so the code below fixes that up. Tedious.
+            idx = 0
+            prev_regex = regex_value
+            regex_value = ""
+            while idx < len(prev_regex):
+                ## Skip braces that are already escaped.
+                if prev_regex[idx:idx+2] == "\\{" or prev_regex[idx:idx+2] == "\\}":
+                    regex_value += prev_regex[idx:idx+2]
+                    idx += 2
+                    continue
+                
+                ch = prev_regex[idx]
+                ## A } that needs no escaping never reaches this point because idx is advanced past it below, so any } seen here must be escaped.
+                if ch == "}":
+                    regex_value += "\\}"
+                    idx += 1
+                    continue
+                
+                ## Anything other than { is appended as is.
+                if ch != "{":
+                    regex_value += ch
+                    idx += 1
+                    continue
+                
+                ## Handle the { case.
+                reg_match = SPECIAL_REGEX.match(prev_regex[idx:])
+                if reg_match is None:
+                    ## A literal {, so it must be escaped.
+                    regex_value += "\\{"
+                    idx += 1
+                else:
+                    ## This { opens a repetition quantifier, so leave it unescaped and advance idx past the closing }.
+                    regex_value += reg_match.group()
+                    idx += len(reg_match.group())
+
+            return self.generateNode((fieldname, regex_value))
        else:
            raise NotImplementedError("Type modifier '{}' is not supported by backend".format(value.identifier))

@@ -75,20 +121,28 @@ class HayabusaBackend(SingleTextQueryBackend):
        ###     EventID:
        ###         - 1
        ###         - 2
-
        ### 基本的にリストはORと良く、generateListNodeもORNodeを生成している。
        ### しかし、上記のケースでgenerateListNode()を実行すると、下記のようなYAMLになってしまう。
        ### selection:
        ###     EventID: 1 or 2
-
        ### 上記のようにならないように、修正している。
        ### なお、generateMapItemListNode()を有効にするために、self.mapListsSpecialHandling = Trueとしている
+        if self._is_all_str(value):
+            name = self.create_new_selection()
+            self.name_2_selection[name] = [(fieldname,value)]
+            return name
+
        list_values = list()
        for sub_node in value:
            list_values.append((fieldname,sub_node))
-
        return self.subExpression % self.generateORNode(list_values) 
-
+    
+    def _is_all_str(self, values):
+        for value in values:
+            if type(value) != str:
+                return False
+        return True
+    
    def generateAggregation(self, agg):
        # python3 tools/sigmac rules/windows/process_creation/win_dnscat2_powershell_implementation.yml --config tools/config/generic/sysmon.yml --target hayabusa
        if agg == None:
@@ -97,26 +151,106 @@ class HayabusaBackend(SingleTextQueryBackend):
            # condition の中に "|" は1つのみ
            # | 以降をそのまま出力する
            target = '|'
-            index = agg.parser.parsedyaml["detection"]["condition"].find(target)
-            return agg.parser.parsedyaml["detection"]["condition"][index:]
-
+            condition = agg.parser.parsedyaml["detection"]["condition"]
+            
+            ### It turns out that multiple conditions can be specified!
+            ### The specification says "If multiple conditions are given, they are logically linked with OR." See the Sigma rule specification for details.
+            ### For now, rules with multiple conditions are unsupported and treated as errors. (The default implementation in base.py already raises an exception for multiple conditions, so no extra handling is needed here.)
+            ### The problem is the type of agg.parser.parsedyaml["detection"]["condition"].
+            ###
+            ### When the condition is written as follows, agg.parser.parsedyaml["detection"]["condition"] is a string:
+            ### condition: selection1
+            ###
+            ### When it is written as follows, agg.parser.parsedyaml["detection"]["condition"] is a list:
+            ### condition: 
+            ###  - selection1
+            ###
+            ### So the implementation below also handles the list case.
+            if type(condition) == list: 
+                condition = condition[0]
+            index = condition.find(target)
+            return condition[index:]
        ## count以外は対応していないので、エラーを返す
        raise NotImplementedError("This rule contains aggregation operator not implemented for this backend")
-
+    
    def generateValueNode(self, node):
        ## このメソッドをオーバーライドしておかないとint型もstr型として扱われてしまうので、int型やint型として、str型はstr型として処理するために実装した。
        ## このメソッドは最悪無くてもいいような気もする。
-
        if type(node) == int:
            return node
        else:
            return self.valueExpression % (self.cleanValue(str(node)))
+    
+    ### Check whether the node is a ConditionOR whose items are all str
+    def is_keyword_list(self, node ):
+        if type(node) != ConditionOR:
+            return False
+        
+        for item in node.items:
+            if type(item) != str:
+                return False
+        
+        return True
+    
+    def generateANDNode(self, node):        
+        generated = list()
+        for val in node:
+            if type(val) == str:
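+                ### Normally each item is a (key, value) tuple, but this is the case where no key is specified.
+                ### When no key is specified, the whole event log is searched grep-style. (See the Sigma rule specification for details.)
+                ### Concretely, using something like "all of" leads to this branch.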
+                name = self.create_new_selection()
+                self.name_2_selection[name] = [(None, val)]
+                generated_node = name
+            else:
+                ### The usual case ends up here.
+                generated_node = self.generateNode(val)
+            generated.append(generated_node)
+        filtered = [ g for g in generated if g is not None ]
+        if filtered:
+            if self.sort_condition_lists:
+                filtered = sorted(filtered)
+            return self.andToken.join(filtered)
+        else:
+            return None
+        
+    def generateORNode(self, node):
+        if self.is_keyword_list(node) == True:
+            ## Normally each item is a (key, value) tuple, but this is the case where no key is specified.
+            ## This branch is reached when none of the items has a key.
+            name = self.create_new_selection()
+            self.name_2_selection[name] = [(None, val) for val in node]
+            return name
+        
+        name = None
+        generated = list()
+        for val in node:
+            ### Normally each item is a (key, value) tuple, but this is the case where no key is specified.
+            if type(val) == str:
+                if name is None:
+                    name = self.create_new_selection()
+                    self.name_2_selection[name] = list()
+                self.name_2_selection[name].append((None,val))
+            else:
+                generated.append(self.generateNode(val))
+        if name is not None:
+            generated.append(name)

+        filtered = [ g for g in generated if g is not None ]
+        if filtered:
+            if self.sort_condition_lists:
+                filtered = sorted(filtered)
+            return self.orToken.join(filtered)
+        else:
+            return None
+        
    def generateQuery(self, parsed):
+        ### Instances of this class are reused, so reset the internal member variables.
+        self.re_init()
        result = self.generateNode(parsed.parsedSearch)
        if parsed.parsedAgg:
            res = self.generateAggregation(parsed.parsedAgg)
-            result += res
+            result += " " + res
        ret = ""
        with StringIO() as bs:
            ## 元のyamlをいじるとこの後の処理に影響を与える可能性があるので、deepCopyする
@@ -131,11 +265,23 @@ class HayabusaBackend(SingleTextQueryBackend):
            parsed_yaml["detection"] = {}
            parsed_yaml["detection"]["condition"] = result
            for key, values in self.name_2_selection.items():
-                parsed_yaml["detection"][key] = {}
+                ### Check whether a fieldname is present.
+                if values[0][0]:
+                    ## Normally there is a fieldname, so initialize with a dict.
+                    parsed_yaml["detection"][key] = {}
+                else:
+                    ## Reached only when is_keyword_list() == True.
+                    parsed_yaml["detection"][key] = []
+                    
                for fieldname, value in values:
-                    parsed_yaml["detection"][key][fieldname] = value
-
+                    if fieldname is None:
+                        ## The is_keyword_list() == True case.
+                        parsed_yaml["detection"][key].append(value)
+                    else:
+                        ## The is_keyword_list() == False case.
+                        parsed_yaml["detection"][key][fieldname] = value
            yaml.dump(parsed_yaml, bs, indent=4, default_flow_style=False)
            ret = bs.getvalue()
-
-        return ret
+            ret += "---\n"
+        
+        return ret
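
Note on the brace handling added to `generateMapItemTypedNode` above: the loop can be exercised in isolation. The following is a minimal standalone sketch of the same idea, not the backend code itself. The function name `escape_literal_braces` and the constant `QUANTIFIER` are hypothetical names chosen for this illustration, and the sketch covers only the `{`/`}` step, not the slash and double-quote unescaping.

```python
import re

# Matches a repetition quantifier such as {3}, {2,} or {2,5} at the start of a string.
# (Hypothetical name; plays the role of SPECIAL_REGEX in the diff above.)
QUANTIFIER = re.compile(r"^\{\d+,?(\d+)?\}")


def escape_literal_braces(pattern: str) -> str:
    """Escape braces that are literal characters, leaving quantifiers untouched."""
    out = []
    idx = 0
    while idx < len(pattern):
        two = pattern[idx:idx + 2]
        # Braces that are already escaped are copied through unchanged.
        if two in ("\\{", "\\}"):
            out.append(two)
            idx += 2
            continue

        ch = pattern[idx]
        # Quantifier-closing braces are consumed below, so a } seen here is literal.
        if ch == "}":
            out.append("\\}")
            idx += 1
            continue

        # Any character other than { is copied as is.
        if ch != "{":
            out.append(ch)
            idx += 1
            continue

        match = QUANTIFIER.match(pattern[idx:])
        if match is None:
            # A literal {, which must be escaped.
            out.append("\\{")
            idx += 1
        else:
            # A quantifier: copy it whole and skip past its closing }.
            out.append(match.group())
            idx += len(match.group())
    return "".join(out)
```

Under these assumptions, `a{2,3}b` is returned unchanged, while `${x}` becomes `$\{x\}`. Like the loop in the diff, the sketch does not treat an escaped backslash followed by a brace (`\\{`) specially.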
--- a/tools/sigmac/requirements.txt
+++ b/tools/sigmac/requirements.txt
@@ -1,3 +1,3 @@
 pyyaml
-ruamel_yaml
+ruamel.yaml
 requests
--- a/tools/sigmac/splitter.py
+++ b/tools/sigmac/splitter.py
@@ -0,0 +1,38 @@
+## pip install ruamel.yaml
+
+import os
+import ruamel.yaml
+
+yaml = ruamel.yaml.YAML()
+
+
+def load_ymls( filepath ):
+    with open(filepath) as f:
+        return list(yaml.load_all(f))
+
+def dump_yml( filepath, data ):
+    with open(filepath, "w") as stream:
+        yaml.dump(data, stream )
+
+def main():
+    loaded_ymls = load_ymls("sigma_to_hayabusa.yml")
+    for loaded_yml in loaded_ymls:
+        if loaded_yml is None:
+            continue
+
+        if loaded_yml["yml_path"] is None or len(loaded_yml["yml_path"]) == 0:
+            continue
+
+        out_dir = "hayabusa_rules/" + loaded_yml["yml_path"]
+        out_path = out_dir + "/" + loaded_yml["yml_filename"]
+
+        if not os.path.exists(out_dir):
+            os.makedirs(out_dir)
+
+        loaded_yml.pop("yml_path")
+        loaded_yml.pop("yml_filename")
+
+        dump_yml(out_path,loaded_yml)
+
+if __name__ == "__main__":
+    main()
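
Note on `splitter.py`: its path handling can be tried without a YAML library. The sketch below uses only the standard library and stands plain dictionaries in for the parsed YAML documents. The helper names `output_path` and `split_rules` are hypothetical, `yml_path` is assumed to be a relative path, and `repr()` is only a placeholder for the real `yaml.dump` call.

```python
import os
import tempfile


def output_path(base_dir, rule):
    """Return the target file path for one converted rule, or None to skip it."""
    if not rule:
        return None  # empty document, e.g. after a trailing '---'
    yml_path = rule.get("yml_path")
    if not yml_path:
        return None  # no destination folder recorded for this rule
    return os.path.join(base_dir, yml_path, rule["yml_filename"])


def split_rules(base_dir, rules):
    """Write each rule to its own file and return the list of paths written."""
    written = []
    for rule in rules:
        path = output_path(base_dir, rule)
        if path is None:
            continue
        os.makedirs(os.path.dirname(path), exist_ok=True)
        # Drop the bookkeeping keys before writing, as splitter.py does with pop().
        body = {k: v for k, v in rule.items() if k not in ("yml_path", "yml_filename")}
        with open(path, "w") as stream:
            stream.write(repr(body))  # placeholder for yaml.dump(body, stream)
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = [{"yml_path": "process_creation", "yml_filename": "a.yml", "title": "demo"}]
        print(split_rules(os.path.join(tmp, "hayabusa_rules"), demo))
```

Using `os.makedirs(..., exist_ok=True)` removes the need for a separate `os.path.exists` check, and `dict.get` avoids a `KeyError` when a document has no `yml_path` key.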