PySpark Plaso
Release 2019
A tool for distributed extraction of timestamps from various files using extractors adapted from the Plaso engine to Apache Spark.


Public Member Functions
    def __init__(self, hdfs_base_uri, spark_context)
    def extract(self, hdfs_path="")

Public Member Functions inherited from plaso.tarzan.app.controllers.controller.Controller
    def __init__(self, hdfs_base_uri)
    def make_hdfs_uri(self, hdfs_path)
    def strip_hdfs_uri(self, hdfs_path)

Public Attributes
    spark_context

Public Attributes inherited from plaso.tarzan.app.controllers.controller.Controller
    hdfs_base_uri

Controller for the extraction of events by Plaso.

def plaso.tarzan.app.controllers.plasocontroller.PlasoController.__init__(self, hdfs_base_uri, spark_context)
Create a new controller that uses an HDFS URI and a SparkContext.
:param hdfs_base_uri: the base HDFS URI to store
:param spark_context: the Spark context

def plaso.tarzan.app.controllers.plasocontroller.PlasoController.extract(self, hdfs_path="")
Run Plaso extractors on a given HDFS path to generate events.
:param hdfs_path: the path to extract events from
:return: the Flask Response with a JSON document of extracted events
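
The call pattern documented above (construct with a base URI and a Spark context, then call extract with a path) can be illustrated with a stand-in class. This is a minimal sketch, not the project's implementation: StandInPlasoController, the slash-joining rule in make_hdfs_uri, the JSON field names, and the example URI and path are all assumptions made for illustration. The stand-in returns JSON text instead of a Flask Response so that it runs without Spark, HDFS, or Flask.

```python
import json


class StandInPlasoController:
    """Hypothetical stand-in that mirrors only the documented signatures."""

    def __init__(self, hdfs_base_uri, spark_context):
        self.hdfs_base_uri = hdfs_base_uri
        self.spark_context = spark_context

    def make_hdfs_uri(self, hdfs_path):
        # Assumption: base URI and path are joined with a single slash.
        return self.hdfs_base_uri.rstrip("/") + "/" + hdfs_path.lstrip("/")

    def extract(self, hdfs_path=""):
        # The documented method returns a Flask Response; this stand-in
        # returns JSON text, and its event list is always empty.
        document = {"source": self.make_hdfs_uri(hdfs_path), "events": []}
        return json.dumps(document)


# A real caller would pass a pyspark SparkContext; None is a placeholder here.
controller = StandInPlasoController("hdfs://namenode:8020", None)
body = json.loads(controller.extract("/evidence/disk1"))
print(body["source"])
```

With the real class, the second constructor argument would be a live Spark context, and the caller would read the JSON document from the returned Flask Response.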

plaso.tarzan.app.controllers.plasocontroller.PlasoController.spark_context