Jarvis artifact

This is the README for the Jarvis artifact.

Dataset and Ground truth

The micro-benchmark and macro-benchmark are provided in the dataset and ground_truth directories.

Getting Jarvis to run

Prerequisites:

  • Python = 3.8
  • PyCG: tool/PyCG
  • Jarvis: tool/Jarvis

Run jarvis_cli.py.

Jarvis usage:

$ python3 tool/Jarvis/jarvis_cli.py [module_path1 module_path2 module_path3...] [--package package_path] [--decy] [-o output_path]

Jarvis help:

$ python3 tool/Jarvis/jarvis_cli.py -h
  usage: jarvis_cli.py [-h] [--package PACKAGE] [--decy] [--precision]
                       [--moduleEntry [MODULEENTRY ...]]
                       [--operation {call-graph,key-error}] [-o OUTPUT]
                       [module ...]

  positional arguments:
    module                modules to be processed, which are also 'Demands' in D.W. mode 

  options:
    -h, --help            show this help message and exit
    --package PACKAGE     Package containing the code to be analyzed
    --decy                whether analyze the dependencies
    --precision           whether flow-sensitive
    --entry-point [MODULEENTRY ...]
                          Entry functions to be processed
    -o OUTPUT, --output OUTPUT
                          Output call graph path

Example 1: analyze bpytop.py in E.A. mode.

$ python3 tool/Jarvis/jarvis_cli.py dataset/macro-benchmark/pj/bpytop/bpytop.py --package dataset/macro-benchmark/pj/bpytop -o jarvis.json

Example 2: analyze bpytop.py in D.W. mode. Note that all dependencies must be installed in the virtual environment before running Jarvis.

# create virtualenv environment
$ virtualenv venv --python=python3.8
# activate it so that the following commands use the virtualenv
$ source venv/bin/activate
# install dependencies in virtualenv environment
$ python3 -m pip install psutil
# run jarvis
$ python3 tool/Jarvis/jarvis_cli.py dataset/macro-benchmark/pj/bpytop/bpytop.py --package dataset/macro-benchmark/pj/bpytop --decy -o jarvis.json
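
The output file written by `-o` can be inspected with a few lines of Python. The sketch below assumes the file uses a PyCG-style adjacency format, that is, a JSON object mapping each caller's qualified name to a list of callee names. That format is an assumption and should be checked against the actual file. The helper `summarize_call_graph` is hypothetical and not part of Jarvis.

```python
import json
from pathlib import Path


def summarize_call_graph(graph):
    """Return (node_count, edge_count) for a caller -> [callees] mapping."""
    nodes = set(graph)
    edges = 0
    for callees in graph.values():
        nodes.update(callees)
        edges += len(set(callees))  # ignore duplicate callee entries
    return len(nodes), edges


if __name__ == "__main__":
    # Assumption: the output is a JSON object of the form {"caller": ["callee", ...]}.
    path = Path("jarvis.json")
    if path.exists():
        graph = json.loads(path.read_text())
        node_count, edge_count = summarize_call_graph(graph)
        print(f"{node_count} nodes, {edge_count} edges")
```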

Evaluation

RQ1 and RQ2 Setup

cd to the root directory of the unzipped files.

# 1. run micro_benchmark
$ ./reproducing_RQ12_setup/micro_benchmark/test_All.sh
# 2. run macro_benchmark
$ ./reproducing_RQ12_setup/macro_benchmark/pycg_EA.sh
#     PyCG iterates once
$ ./reproducing_RQ12_setup/macro_benchmark/pycg_EW.sh 1
#     PyCG iterates twice
$ ./reproducing_RQ12_setup/macro_benchmark/pycg_EW.sh 2
#     PyCG iterates to convergence 
$ ./reproducing_RQ12_setup/macro_benchmark/pycg_EW.sh
$ ./reproducing_RQ12_setup/macro_benchmark/jarvis_DA.sh
$ ./reproducing_RQ12_setup/macro_benchmark/jarvis_EA.sh
$ ./reproducing_RQ12_setup/macro_benchmark/jarvis_DW.sh

RQ1. Scalability Evaluation

Scalability results

Run

$ python3 ./reproducing_RQ1/gen_table.py

The results are shown below:

[Image: scalability]

AGs and FAGs

Run

$ python3 ./reproducing_RQ1/FAG/plot.py

The generated graphs are pycg-ag.pdf, pycg-change-ag.pdf, and jarvis-fag.pdf, which correspond to Fig. 9a, Fig. 9b, and Fig. 10, respectively.

RQ2. Accuracy Evaluation

Accuracy results

Run

$ python3 ./reproducing_RQ2/gen_table.py     

The generated results:

[Image: accuracy]
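
For readers who want to sanity-check a single call graph against its ground truth by hand, the sketch below computes edge-level precision and recall. It is an illustration only: it assumes both graphs are caller -> [callees] mappings, it is not necessarily how gen_table.py computes its numbers, and the toy data is made up rather than taken from the benchmark.

```python
def edge_set(graph):
    """Flatten a caller -> [callees] mapping into a set of (caller, callee) edges."""
    return {(caller, callee) for caller, callees in graph.items() for callee in callees}


def precision_recall(generated, ground_truth):
    """Compute edge-level precision and recall of a generated call graph."""
    gen, truth = edge_set(generated), edge_set(ground_truth)
    true_positives = len(gen & truth)
    precision = true_positives / len(gen) if gen else 1.0
    recall = true_positives / len(truth) if truth else 1.0
    return precision, recall


# Toy data for illustration, not taken from the benchmark.
generated = {"m.f": ["m.g", "m.h"], "m.g": ["m.h"]}
ground_truth = {"m.f": ["m.g"], "m.g": ["m.h", "m.k"]}
precision, recall = precision_recall(generated, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```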

Case Study: Fine-grained Tracking of Vulnerable Dependencies

1. Target projects

Fastapi, Httpie, Scrapy, Lightning, Airflow, sherlock, wagtail

2. Vulnerable libraries in Top 10 dependencies

The CVEs of html, numpy, lxml, and psutil do not relate to Python, so we do not consider them.

3. Vulnerable projects using dependency analysis

sherlock
- sherlock.sherlock
  - requests(v2.28.0)
    - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- sherlock.sites
  - requests(v2.28.0)
    - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
airflow
- airflow.kubernetes.kube_client
  - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- airflow.providers.cncf.kubernetes.operators.pod
  - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- airflow.providers.cncf.kubernetes.utils.pod_manager
  - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- airflow.executors.kubernetes_executor
  - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
......
wagtail
- wagtail.contrib.frontend_cache.backends
  - requests(v2.28.0)
    - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
Httpie
- httpie.client
  - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- httpie.ssl_
  - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- httpie.models
  - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
Scrapy
- scrapy.downloadermiddlewares.cookies
  - tldextract(v3.4.4)
    - requests(v2.28.0)
      - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
Lightning
- lightning.app.utilities.network
  - requests(v2.28.0)
    - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- lightning.app.utilities.network
  - requests(v2.28.0)
    - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
- lightning.app.utilities.network
  - requests(v2.28.0)
    - urllib3(v1.26.0) ---- [CVE-2021-33503,CVE-2019-11324,CVE-2019-11236,CVE-2020-7212]
...
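
The dependency chains listed above can be thought of as paths in a dependency graph that end at a package with known CVEs. The sketch below illustrates that idea with a depth-first walk. The function `vulnerable_chains` and the hand-written `deps` mapping are hypothetical, built from two of the chains listed above, and this is not the tooling used to produce the listing.

```python
def vulnerable_chains(root, deps, vulnerable):
    """Return every dependency chain from root that ends at a vulnerable package."""
    chains = []

    def visit(node, chain):
        chain = chain + [node]
        if node in vulnerable:
            chains.append(chain)
        for child in deps.get(node, []):
            if child not in chain:  # guard against dependency cycles
                visit(child, chain)

    visit(root, [])
    return chains


# Hand-written mapping for illustration, based on the chains listed above.
deps = {
    "sherlock.sherlock": ["requests"],
    "scrapy.downloadermiddlewares.cookies": ["tldextract"],
    "tldextract": ["requests"],
    "requests": ["urllib3"],
}
vulnerable = {"urllib3": ["CVE-2021-33503"]}

for chain in vulnerable_chains("scrapy.downloadermiddlewares.cookies", deps, vulnerable):
    print(" -> ".join(chain), vulnerable[chain[-1]])
```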

4. Vulnerable projects using method-level invocation analysis

Fastapi

According to the patch commit, the vulnerable method of CVE-2021-33503 in urllib3 is urllib3.util.url.

Below is the method-level invocation path:

Httpie
- httpie.adapters.<main>
  - requests.adapters.<main>
    - urllib3.contrib.socks.<main>
      - urllib3.util.url.<main> ---- CVE-2021-33503
Scrapy
- scrapy.downloadermiddlewares.cookies.<main>
  - tldextract.__init__.<main>
    - tldextract.tldextract.<main>
      - tldextract.suffix_list.<main>
        - requests_file.<main>
          - requests.adapters.<main>
            - urllib3.util.url.<main> ---- CVE-2021-33503
Lightning
- lightning.app.utilities.network.<main>
  - requests.adapters.<main>
    - urllib3.contrib.socks.<main>
      - urllib3.util.url.<main> ---- CVE-2021-33503
Airflow
- airflow.providers.amazon.aws.hooks.base_aws.BaseSessionFactory._get_idp_response
  - requests.adapters.<main>
    - urllib3.contrib.socks.<main>
      - urllib3.util.url.<main> ---- CVE-2021-33503

PS:

<main> represents the top-level body code block of a Python file (Python does not need an entry function).
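
A method-level invocation path like the ones above can be recovered from a call graph with a breadth-first search from the project's entry node to the vulnerable node. The sketch below is illustrative only: `find_call_path` is a hypothetical helper, and the `call_graph` mapping is hand-written from the Scrapy path above, assuming a caller -> [callees] representation.

```python
from collections import deque


def find_call_path(call_graph, source, target):
    """Return a shortest caller-to-callee path from source to target, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk parent links back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for callee in call_graph.get(node, []):
            if callee not in parents:  # visit each node at most once
                parents[callee] = node
                queue.append(callee)
    return None


# Hand-written edges for illustration, following the Scrapy path above.
call_graph = {
    "scrapy.downloadermiddlewares.cookies.<main>": ["tldextract.__init__.<main>"],
    "tldextract.__init__.<main>": ["tldextract.tldextract.<main>"],
    "tldextract.tldextract.<main>": ["tldextract.suffix_list.<main>"],
    "tldextract.suffix_list.<main>": ["requests_file.<main>"],
    "requests_file.<main>": ["requests.adapters.<main>"],
    "requests.adapters.<main>": ["urllib3.util.url.<main>"],
}

path = find_call_path(
    call_graph,
    "scrapy.downloadermiddlewares.cookies.<main>",
    "urllib3.util.url.<main>",
)
print("\n".join(path))
```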

Acknowledgements

Our artifact reuses part of the functionality of third-party libraries, namely PyCG.

Vitalis Salis et al. PyCG: Practical Call Graph Generation in Python. In 43rd International Conference on Software Engineering (ICSE), 25–28 May 2021.