    Website Evidence Collector

    The Website Evidence Collector (WEC) automates the collection of evidence on how websites store and transfer personal data. It is based on the browser Chromium/Chrome and on puppeteer, its JavaScript library for automation.

    ⚡⚡ Quick Start

    First, make sure Node.js (version 20 or later) and npm [1] are installed. Check by running node -v, or install Node.js by following the guide on the Node.js website. Linux users can also use their package manager (e.g., apt install nodejs); check Repology for the version packaged by your distribution.
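The version check can also be scripted. The following is a minimal sketch, assuming that node -v prints a string of the form v20.11.0; the function name node_version_ok is made up for this example and is not part of WEC.

```shell
# Succeeds when the given version string (e.g. "v20.11.0") has a major
# version of at least 20, the minimum stated above.
# node_version_ok is a hypothetical helper, not part of WEC.
node_version_ok() {
  version="${1#v}"          # strip the leading "v"
  major="${version%%.*}"    # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # empty or not a number: treat as unsupported
  esac
  [ "$major" -ge 20 ]
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node -v)"; then
    echo "Node.js $(node -v) is recent enough"
  else
    echo "Node.js $(node -v) is too old; version 20 or later is required" >&2
  fi
else
  echo "Node.js is not installed" >&2
fi
```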

    Second, add our package registry and install the latest version of the Website Evidence Collector.

    $ echo @EDPS:registry=https://code.europa.eu/api/v4/projects/812/packages/npm/ >> ~/.npmrc
    $ npm install --global @EDPS/website-evidence-collector
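Because >> appends a line each time it runs, repeating the first command leaves duplicate registry entries in ~/.npmrc. A small sketch that adds the line only when it is missing could look like this; the function name add_edps_registry is an assumption made for this example.

```shell
# Appends the registry line to the given npm configuration file unless an
# identical line is already present.
# add_edps_registry is a hypothetical helper, not part of WEC or npm.
add_edps_registry() {
  npmrc="$1"
  line='@EDPS:registry=https://code.europa.eu/api/v4/projects/812/packages/npm/'
  touch "$npmrc"
  # -x matches whole lines only, -F treats the pattern as a fixed string
  if ! grep -qxF "$line" "$npmrc"; then
    echo "$line" >> "$npmrc"
  fi
}

# Usage: add_edps_registry "${HOME}/.npmrc"
```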

    Third, run a collection.

    $ website-evidence-collector https://example.com

    To uninstall the tool later, run:

    $ npm uninstall --global @EDPS/website-evidence-collector

    Troubleshooting: Permissions

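    If you encounter permission denied errors during installation, run the following commands, which make npm install global packages into a directory inside your home folder, and then repeat the installation: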

    mkdir "${HOME}/.npm-packages"
    npm config set prefix "${HOME}/.npm-packages"

    Run Website Evidence Collector

    The WEC can be run in two ways: with the collect command, which runs on the command line and saves its output in a folder, or with the serve command, which starts a web server that you access from your browser. The serve command is recommended for quick and simple scans.

    Notice on the Processing of Personal Data: This tool carries out automated processing of data of websites for the purpose of identifying their processing of personal data. If you run the tool to visit web pages containing personal data, this tool will download, display, and store these personal data in the form of text files and screenshots, and you will therefore process personal data.

    Hint: If you run into command not found errors, add the bin directory inside .npm-packages to your PATH by running the following commands:

    NPM_PACKAGES="${HOME}/.npm-packages"  
    export PATH="$PATH:$NPM_PACKAGES/bin"

    You can check your PATH with this command: echo $PATH.
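The export above only affects the current shell session, and running it repeatedly adds the same directory several times. As a sketch, the following helper appends a directory to PATH only when it is not yet listed; placing it in a shell startup file such as ~/.profile is one common way to make the change permanent. The function name path_append is made up for this example.

```shell
# Appends a directory to PATH only when it is not already listed.
# path_append is a hypothetical helper, not part of WEC or npm.
path_append() {
  case ":${PATH}:" in
    *":$1:"*) ;;                  # already present, nothing to do
    *) PATH="${PATH}:$1" ;;
  esac
}

NPM_PACKAGES="${HOME}/.npm-packages"
path_append "${NPM_PACKAGES}/bin"
export PATH
```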

    serve

    The serve command starts a local web server to display the collected evidence. By default, the website is available at http://localhost:8080/.

    $ website-evidence-collector serve

    You can customize the server port and browser options:

    • Use -p to specify a different port.
    • Use --browser-options to pass additional options to the internal Chromium browser.

    Example with custom port and browser options:

    website-evidence-collector serve -p 8081 --browser-options='--disable-webgl' --browser-options='--disable-gpu'

    collect

    The collect command is the default: WEC runs it when no other command is given. It runs a collection from the terminal and saves the result in the output folder by default.

    Basic Usage

    $ website-evidence-collector https://example.com

    Options

    1. Simple output on the terminal only:
    $ website-evidence-collector --no-output --yaml https://example.com 2> /dev/null

    This prints the result as YAML on the terminal and discards the log messages by redirecting standard error to /dev/null.

    2. Ignore certificate errors during collection:
    $ website-evidence-collector -y -q https://untrusted-root.badssl.com -- --ignore-certificate-errors

    This ignores certificate errors when collecting data from the specified URL.

    All command-line arguments after -- (after the second -- when WEC is started through npm) are passed to Chromium at launch.

    Reference: https://peter.sh/experiments/chromium-command-line-switches/#ignore-certificate-errors
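The -- convention can be illustrated with a few lines of shell. This is only a sketch of the general idea, not WEC's actual argument parsing; the function name args_after_separator is made up for this example.

```shell
# Prints every argument that follows the first "--", one per line.
# Illustrative only: this is not how WEC itself parses its arguments.
args_after_separator() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "--" ]; then
      shift        # drop the separator itself
      break
    fi
    shift          # skip arguments meant for the tool
  done
  for arg in "$@"; do
    printf '%s\n' "$arg"
  done
}

args_after_separator -y -q https://untrusted-root.badssl.com -- --ignore-certificate-errors
# prints: --ignore-certificate-errors
```

Everything before the separator belongs to the tool, and everything after it is handed on unchanged, which is why Chromium switches never clash with WEC's own options.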

    3. Integrate with testssl.sh:

    Note: testssl.sh v3.0 or higher must already be installed. The most recent version tested with WEC is v3.0.6.

    With the option --testssl, the website evidence collector calls testssl.sh to gather information about the HTTPS/SSL connection.

    a. Basic usage:

    $ website-evidence-collector --testssl https://example.com

    b. Specify testssl.sh executable location:

    $ website-evidence-collector -q --testssl-executable ../testssl.sh-3.0.6/testssl.sh https://example.com

    c. Use a pre-existing testssl.sh JSON output file:

    $ website-evidence-collector --testssl-file example-testssl.json https://example.com
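A wrapper script may want to choose between variants a and b automatically. The sketch below assumes that --testssl finds testssl.sh through the PATH, which the text above does not state explicitly; the function name testssl_option is made up for this example.

```shell
# Prints the WEC option(s) to use for testssl.sh:
#   --testssl                       when testssl.sh is found on PATH
#   --testssl-executable <path>     when the given fallback path is executable
# Prints nothing and fails when neither is available.
# Assumption: --testssl looks up testssl.sh on PATH.
# testssl_option is a hypothetical helper, not part of WEC.
testssl_option() {
  fallback="$1"
  if command -v testssl.sh >/dev/null 2>&1; then
    echo "--testssl"
  elif [ -n "$fallback" ] && [ -x "$fallback" ]; then
    echo "--testssl-executable $fallback"
  else
    return 1
  fi
}

# Usage (the path must not contain spaces, since the result is word-split):
#   website-evidence-collector $(testssl_option ../testssl.sh-3.0.6/testssl.sh) https://example.com
```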

    🐋 Using Docker or Podman

    Docker/Podman containers are available under https://code.europa.eu/EDPS/website-evidence-collector/container_registry.

    • To run the WEC server, forward the port:

      $ docker run -p 8080:8080 code.europa.eu:4567/edps/website-evidence-collector:latest
    • To collect evidence and save output, map a volume:

      $ docker run -v /path/on/your/system:/output:z --userns=keep-id code.europa.eu:4567/edps/website-evidence-collector:latest collect https://example.com
    • Or build your own image using the Containerfile:

      $ docker build -t website-evidence-collector -f Containerfile .
    • The container accepts the version of testssl.sh used through the environment variable TESTSSL_VERSION.
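The option --userns=keep-id used in the collect example is, to our knowledge, specific to Podman and is rejected by Docker. The following sketch assembles the volume options per engine under that assumption; the function name container_run_options is made up for this example.

```shell
# Prints the engine-specific options for the collect example above.
# Assumption: --userns=keep-id is understood by Podman only, so it is
# left out for Docker.
# container_run_options is a hypothetical helper, not part of WEC.
container_run_options() {
  engine="$1"
  output_dir="$2"
  case "$engine" in
    podman) echo "-v ${output_dir}:/output:z --userns=keep-id" ;;
    docker) echo "-v ${output_dir}:/output:z" ;;
    *) echo "unknown engine: $engine" >&2; return 1 ;;
  esac
}

# Usage (the output path must not contain spaces, since the result is word-split):
#   podman run $(container_run_options podman /path/on/your/system) \
#     code.europa.eu:4567/edps/website-evidence-collector:latest collect https://example.com
```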

    Frequently Asked Questions

    Please find a collection of frequently asked questions with answers in FAQ.md

    Setup of the Development Environment

    1. Install the dependencies as described in the first step of the Quick Start.
    2. Clone the Repository using Git
      $ git clone https://code.europa.eu/EDPS/website-evidence-collector.git
    3. Open the terminal and navigate to the folder website-evidence-collector.
    4. Install the dependencies and compile TypeScript
      $ npm install
      $ npm run install-frontend-dependencies
      $ npm run build
    5. Consider using npm link to make the command website-evidence-collector available outside the project folder.

    TODO List

    Third-Party Software

    The following software extends WEC to cover further use cases. It is developed independently of the WEC and is not tested or approved by the WEC developers.

    Resources for Developers

    Contributors

    License

    This work, excluding filter lists, is distributed under the European Union Public Licence (the ‘EUPL’). Please find the terms in the file LICENSE.txt.

    Filter lists in the assets/ directory are authored by the EasyList authors (https://easylist.to/). For your convenience, they are distributed together with this work under their respective licenses, as indicated in their file headers.

    1. npm stands for Node.js package manager.