How to introduce Semgrep to your organization

Semgrep, a static analysis tool for finding bugs and specific code patterns in more than 30 languages, is set apart by its ease of use, many built-in rules, and the ability to easily create custom rules. We consider it an essential automated tool for discovering security issues in a codebase. Since Semgrep can directly improve your code’s security, it’s easy to say, “Just use it!” But what does that mean?

Semgrep is designed to be flexible to fit your organization’s specific needs. To get the best results, it’s important to understand how to run Semgrep, which rules to use, and how to integrate it into the CI/CD pipeline. If you are unsure how to get started, here is our seven-step plan to determine how to best integrate Semgrep into your SDLC, based on what we’ve learned over the years.

The 7-step Semgrep plan

    1. Review the list of supported languages to understand whether Semgrep can help you.

    2. Explore: Try Semgrep on a small project to evaluate its effectiveness. For example, navigate into the root directory of a project and run:

      $ semgrep --config auto

      There are a few important notes to consider when running this command:

      • The --config auto option submits metrics to Semgrep, which may not be desirable.
      • Invoking Semgrep in this way will present an overview of identified issues, including the number and severity. In general, you can use this CLI flag to gain a broad view of the technologies covered by Semgrep.
      • Semgrep identifies programming languages by file extensions rather than analyzing their contents. Some paths are excluded from scanning by default using the default .semgrepignore file. Additionally, Semgrep excludes untracked files listed in a .gitignore file.

    3. Dive deep: Instead of using the auto option, use the Semgrep Registry to select rulesets that match key security patterns, your tech stack, and your needs.

      • Try:

      $ semgrep --config p/default
      $ semgrep --config p/owasp-top-ten
      $ semgrep --config p/cwe-top-25

      • Or choose a ruleset based on your technology:

      $ semgrep --config p/javascript

      • Focus first on rules whose metadata indicates high confidence and medium or high impact. If there are too many results, limit them to error-severity findings with the --severity ERROR flag.
      • Resolve identified issues and include reproduction instructions in your bug reports.
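
As a rough sketch of the steps above, the rulesets can be combined in a single run and narrowed to error-severity findings. The helper name scan_high_severity is hypothetical and the choice of rulesets is only an assumption; --metrics=off is added on the assumption that you do not want to submit metrics.

```shell
# Hypothetical helper: scan with Registry rulesets instead of `--config auto`.
# $1: path to scan (defaults to the current directory)
scan_high_severity() {
  semgrep --metrics=off \
    --config p/default \
    --config p/owasp-top-ten \
    --config p/cwe-top-25 \
    --severity ERROR \
    "${1:-.}"
}
```

Calling scan_high_severity path/to/project then runs all three rulesets against that directory and reports only ERROR findings.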

    4. Fine-tune: Arrive at your ideal chain of rulesets by reviewing the effectiveness of the rulesets you currently use.

      • Check out non-security rulesets, too, such as best practices rules. This will enhance code readability and may prevent the introduction of vulnerabilities in the future. Also, consider covering other aspects of your project:

        • Shell scripts, configuration files, generic files, Dockerfiles
        • Third-party dependencies (Semgrep Supply Chain, a paid feature, can help you detect whether you are using a vulnerable package in an exploitable way)
      • To ignore a code pattern that Semgrep flags incorrectly, add a comment on the first line of the pattern match or on the line preceding it, e.g., // nosemgrep: go.lang.security.audit.xss. Also, explain why you decided to disable the rule or provide a risk-acceptance reason.
      • Create a customized .semgrepignore file to reduce noise by excluding specific files or folders from the Semgrep scan. Semgrep ignores files listed in .gitignore by default. To maintain this, after creating a .semgrepignore file, add .gitignore to your .semgrepignore with the pattern :include .gitignore.
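
A minimal sketch of such a file follows. The excluded paths are hypothetical examples rather than recommendations; only the :include .gitignore line comes from the text above.

```shell
# Write a custom .semgrepignore in the repository root.
cat > .semgrepignore <<'EOF'
# Keep honoring .gitignore, as Semgrep does by default
:include .gitignore

# Hypothetical noise sources; adjust to your repository layout
node_modules/
vendor/
tests/fixtures/
*.min.js
EOF
```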

    5. Create an internal repository to aggregate custom Semgrep rules specific to your organization. A README file should include a short tutorial on using Semgrep, applying custom rules from your repository, and an inventory table of custom rules. Also, a contribution checklist will allow your team to maintain the quality level of the rules (see the Trail of Bits Semgrep rule development checklist). Ensure that adding a new Semgrep rule to your internal Semgrep repository includes a peer review process to reduce false positives/negatives.
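
One possible way to scaffold such a repository is sketched below. The directory name semgrep-rules, the file names, and the checklist items are all assumptions made for illustration; adapt them to your own conventions.

```shell
# Hypothetical layout for an internal rules repository.
mkdir -p semgrep-rules/rules

# README with a short tutorial and an inventory table of custom rules.
cat > semgrep-rules/README.md <<'EOF'
# Internal Semgrep rules

## Quick start

    semgrep --config rules/ path/to/your/project

## Rule inventory

| Rule ID | Language | Description |
| ------- | -------- | ----------- |
| (add one row per custom rule) | | |
EOF

# Contribution checklist used during peer review of new rules.
cat > semgrep-rules/CONTRIBUTING.md <<'EOF'
# Contribution checklist

- [ ] The rule has test cases for both matches and non-matches.
- [ ] The rule includes metadata (CWE, confidence, likelihood, impact).
- [ ] A peer has reviewed the rule for false positives and false negatives.
EOF
```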

    6. Evangelize: Train developers and other relevant teams on effectively using Semgrep.

      • Present pilot test results and advice on improving the organization’s code quality and security. Show potential Semgrep limitations (single-file analysis only).
      • Include the official Learn Semgrep resource and present the Semgrep Playground with “simple mode” for easy rule creation.
      • Provide an overview of how to write custom rules and emphasize that writing custom Semgrep rules is easy. Mention that the custom rules can be extended with the auto-fix feature using the fix: key. Encourage using metadata (i.e., CWE, confidence, likelihood, impact) in custom rules to support the vulnerability management process.
      • To help a developer answer the question, “Should I create a Semgrep rule for this problem?” you can use these follow-up questions:

        • Can we detect a specific security vulnerability?
        • Can we enforce best practices/conventions or maintain code consistency?
        • Can we optimize the code by detecting code patterns that affect performance?
        • Can we validate a specific business requirement or constraint?
        • Can we identify deprecated/unused code?
        • Can we spot any misconfiguration in a configuration file?
        • Is this a recurring question as you review your code?
        • How is code documentation handled, and what are the requirements for documentation?
      • Create places for the team to discuss Semgrep, write custom rules, troubleshoot (e.g., a Slack channel), and jot down ideas for Semgrep rules (e.g., on a Trello board). Also, consider writing custom rules for bugs found during your organization’s security audits/bug bounty program. A good idea is to aggregate quick notes to help your team use Semgrep (see the appendix below).
      • Pay attention to the Semgrep Community Slack, where the Semgrep community helps with problems or writing custom rules.
      • Encourage the team to report the limitations and bugs they encounter while using Semgrep to the Semgrep team by filing GitHub issues (see this example issue submitted by Trail of Bits).
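
To make the points about the fix: key and metadata concrete, here is a small sketch of a custom rule. The rule ID, file names, message, and metadata values are hypothetical; the metadata keys follow the ones named above (CWE, confidence, likelihood, impact).

```shell
mkdir -p rules

# Hypothetical custom rule with an autofix and triage metadata.
cat > rules/example-yaml-load.yaml <<'EOF'
rules:
  - id: example-yaml-load
    languages: [python]
    severity: WARNING
    message: "yaml.load() can construct arbitrary objects; prefer yaml.safe_load()."
    pattern: yaml.load($DATA)
    fix: yaml.safe_load($DATA)
    metadata:
      cwe: "CWE-502: Deserialization of Untrusted Data"
      confidence: MEDIUM
      likelihood: LOW
      impact: HIGH
EOF

# Small target file that the rule is meant to match.
cat > example.py <<'EOF'
import yaml

def parse(data):
    return yaml.load(data)
EOF
```

With Semgrep installed, running semgrep --metrics=off --config rules/example-yaml-load.yaml --autofix --dryrun example.py should report the match and the proposed replacement without modifying the file.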

    7. Implement Semgrep in the CI/CD pipeline by getting acquainted with the Semgrep documentation related to your CI vendor. Incorporating Semgrep incrementally is important to avoid overwhelming developers with too many results. So, try out a pilot test first on a single repository. Then, run the full Semgrep scan on a schedule on the main branch in the CI/CD pipeline. Finally, add a diff-aware scan that runs when an event triggers (e.g., a pull/merge request). A diff-aware scan covers only the files changed by the triggering event, which keeps it efficient. This scan should use a fine-tuned set of rules that yield high-confidence, true positive results. Once the Semgrep implementation is mature, configure Semgrep in the CI/CD pipeline to block PR pipelines that have unresolved Semgrep findings.
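
The two scan modes described in this step could look roughly like the following when driven from a generic CI shell step. The function names, the ruleset, and the rules/ directory are assumptions; the Semgrep documentation for your CI vendor remains the reference for the actual setup.

```shell
# Hypothetical CI helpers; the ruleset and the rules/ directory are assumptions.

# Scheduled job on the main branch: full scan, report only.
full_scan() {
  semgrep --metrics=off --config p/default .
}

# Pull/merge request job: report only findings that are not already present
# in the baseline commit, and exit non-zero (--error) when any are found.
# $1: baseline commit, e.g. the output of `git merge-base origin/main HEAD`
diff_scan() {
  semgrep --metrics=off --config rules/ --baseline-commit "$1" --error .
}
```

Leaving --error out of diff_scan until the ruleset is tuned keeps the scan informational, which matches the advice to block pipelines only once the implementation is mature.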

What’s next? Maximizing the value of Semgrep in your organization

As you introduce Semgrep to your organization, remember that it undergoes frequent updates. To make the most of its benefits, assign one person in your organization to be responsible for analyzing new features (e.g., Semgrep Pro, which extends codebase scanning with inter-file coding paradigms instead of Semgrep’s single-file approach), informing the team about external repositories of Semgrep rules, and determining the value of the paid subscription (e.g., access to premium rules).

Furthermore, use the Trail of Bits Testing Handbook, a concise guide that helps developers and security professionals maximize the potential of static and dynamic analysis tools. The first chapter of this handbook focuses specifically on Semgrep. Check it out to learn more!

Appendix: Things I wish I’d known before I started using Semgrep

Using Semgrep

  • Use the --sarif output flag with the Sarif Viewer extension in Visual Studio Code to efficiently navigate through the identified code.
  • The --config auto option may miss some vulnerabilities. Manual language selection (--lang) and rulesets can be more effective.
  • So that you do not forget to disable metrics, define an alias such as alias semgrep="semgrep --metrics=off" or set the SEMGREP_SEND_METRICS environment variable.
  • Use ephemeral rules, e.g., semgrep -e 'exec(...)' --lang=py ./, to quickly use Semgrep in the style of the grep tool.
  • Enable the autocomplete feature so that you can use the TAB key to work faster on the command line.
  • You can run several predefined configurations simultaneously: semgrep --config p/cwe-top-25 --config p/jwt.
  • The Semgrep Pro Engine removes Semgrep’s limitation of analyzing only single files.
  • Rules from the Semgrep Registry can be tested in a playground (see Trail of Bits anonymous-race-condition rule).
  • Metavariable analysis supports two analyzers: redos and entropy.
  • You can use metavariable-pattern to match patterns across different languages within a single file (e.g., JavaScript embedded in HTML).
  • The focus-metavariable operator can reduce false positives in taint mode.
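
Several of these tips can be collected in a shell profile, roughly as sketched here. The helper names scan_to_sarif and sgrep and the default output file name are hypothetical.

```shell
# Disable metrics for every Semgrep invocation in this shell session.
export SEMGREP_SEND_METRICS=off

# Hypothetical helper: run two rulesets together and write SARIF output.
# $1: path to scan (default: .), $2: output file (default: results.sarif)
scan_to_sarif() {
  semgrep --config p/cwe-top-25 --config p/jwt \
    --sarif --output "${2:-results.sarif}" "${1:-.}"
}

# Hypothetical helper: grep-style search with an ephemeral pattern.
# $1: pattern, $2: language, $3: path (default: .)
sgrep() {
  semgrep -e "$1" --lang="$2" "${3:-.}"
}
```

For example, sgrep 'exec(...)' py ./ searches Python files for calls to exec, and the file written by scan_to_sarif can be opened with the Sarif Viewer extension.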

Writing rules

  • Metavariables must be capitalized: $A, not $a.
  • Use the pattern-regex: (?s)\A.*\Z pattern to identify a file that does not contain a specific string (see example).
  • When writing a regular expression across multiple lines, use the >- characters, not |. The | character writes a newline character (\n) and will likely cause the regex to fail (see example).
  • You can use typed metavariables, e.g., $X == (String $Y).
  • Semgrep patterns can match variable assignment statements.
  • Semgrep patterns can match method chaining.
  • The Deep Expression Operator matches complex, nested expressions using the syntax

    <... pattern ...>
  • It is possible to apply specific rules to specific paths using the paths keyword (see the avoid-apt-get-upgrade rule, which applies only to Dockerfiles):
    paths:
      include:
        - "*dockerfile*"
        - "*Dockerfile*"
  • And last, Trail of Bits has a public Semgrep rules repository! Check it out here and use it immediately with the semgrep --config p/trailofbits command.
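
A few of the syntax features listed above are sketched in the rule file below. All rule IDs, messages, and the file name are hypothetical, and the rules are illustrations rather than production-ready checks.

```shell
mkdir -p rules

cat > rules/syntax-examples.yaml <<'EOF'
rules:
  # Typed metavariable (Java): $Y must have type String.
  - id: example-typed-metavariable
    languages: [java]
    severity: WARNING
    message: "String compared with ==; consider equals()."
    pattern: $X == (String $Y)

  # Deep Expression Operator (Python): is_admin() anywhere in the condition.
  - id: example-deep-expression
    languages: [python]
    severity: INFO
    message: "Branch depends on an is_admin() check."
    pattern: |
      if <... $USER.is_admin() ...>:
        ...

  # paths keyword: restrict a rule to Dockerfiles.
  - id: example-paths-filter
    languages: [generic]
    severity: INFO
    message: "apt-get upgrade found in a Dockerfile."
    pattern: apt-get upgrade
    paths:
      include:
        - "*dockerfile*"
        - "*Dockerfile*"
EOF
```

The file can then be passed to Semgrep with --config rules/syntax-examples.yaml, or pasted rule by rule into the Semgrep Playground to experiment.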

Useful links

For more on creating custom rules, read our blogs on machine learning libraries and discovering goroutine leaks.

We’ve compiled a list of additional resources to further assist you in your Semgrep adoption process. These links provide a variety of perspectives and detailed information about the tool, its applications, and the community that supports it:

Originally published by Maciej Domanski: How to introduce Semgrep to your organization

Copyright notice: posted by admin on January 15, 2024, at 9:51 PM.
When reposting, please credit: How to introduce Semgrep to your organization | CTF导航
