
By now there are a number of researchers in Germany working intensively on various aspects of "Graph Data Management". So far, however, networking and regular exchange among these researchers have taken place only to a limited extent. The first Graph Community Meeting offers everyone interested in graph data the opportunity to get to know one another, present research results, discuss ideas, and make contacts.


9:15 - 10:45 Modeling and Simulation of Scheduled Public Transport
Ralf Rückert – MLU Halle/Wittenberg

In practice, the rail transport network is more than just a stored set of timetables. Representing realistic processes requires careful modeling and management of a dynamic network. This network will be used to make customer-oriented recommendations for disruption management in live operation.

GPU-based Computation of Shortest Paths in Rail Transport
Steffen Rechner – MLU Halle/Wittenberg

Computing shortest paths in transportation networks is a fundamental task in applications such as timetable information or passenger simulation. An efficient implementation of shortest-path queries is crucial for the real-time capability of the system. The talk shows how GPU computing and algorithm engineering can be used to achieve an efficient implementation.
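
For orientation, a shortest-path query of the kind mentioned in this abstract can be sketched on the CPU with Dijkstra's algorithm. This is only a minimal baseline under assumed data, not the GPU implementation presented in the talk; the function name, the station names, and the travel times are hypothetical.

```python
import heapq

def shortest_travel_times(graph, source):
    """Dijkstra's algorithm: minimum travel time from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, minutes in graph.get(node, []):
            candidate = d + minutes
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

# Hypothetical toy network: station -> [(next station, travel time in minutes)]
toy_network = {
    "A": [("B", 10), ("C", 25)],
    "B": [("C", 10), ("D", 30)],
    "C": [("D", 12)],
    "D": [],
}

travel_times = shortest_travel_times(toy_network, "A")
```

In this toy network the fastest route from A to D goes via B and C, since the direct edge A to C is slower than the detour through B.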

11:00 - 12:30 LogiScale – BigData in Logistics
Karl Däubel – TU Berlin

In my talk I will present the research project "LogiScale". The project aims to design and develop a multiscale representation of logistics networks. A multiscale representation makes it possible to represent cost, time, or location data at different orders of magnitude (scales). Detailed findings and solutions at small scales are aggregated, coupled with coarser solutions at higher scales, and made available for computation in optimization algorithms.

Graph-based Analysis of Dynamic Systems
Benjamin Schiller – TU Dresden

The analysis of dynamic systems provides insights into their time-dependent characteristics. While various approaches have been developed to analyze dynamic graphs, it is not always clear which one performs best for the analysis of a specific graph. Hence, tools are required to benchmark and compare different algorithms for the computation of graph properties and data structures for the representation of dynamic graphs in memory. Based on deeper insights into their performance, new algorithms can be developed and efficient data structures can be selected. To address these challenges, we present a benchmarking framework for dynamic graph analysis, novel algorithms for the efficient analysis of dynamic graphs, an approach for the parallelization of dynamic graph analysis, and a novel paradigm to select and adapt graph data structures. In addition, we present three use cases from the areas of social, computer, and biological networks to illustrate the great insights provided by their graph-based analysis.

13:30 - 15:00 Distributed Graph Analytics with Gradoop
Martin Junghanns – Uni Leipzig

In this talk, I will give an overview of Gradoop, our open-source framework for distributed graph analytics. I will explain the implemented data model and highlight a subset of the available graph operators. My focus will be on graph grouping and graph pattern matching. I will show how the operators are implemented using the underlying dataflow framework and present some experimental results.
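
The idea behind graph grouping can be sketched in a few lines of plain Python: vertices that share a property value are merged into one super vertex, and edges are merged by the groups of their endpoints and their label, with counts as aggregates. This is an illustrative single-machine sketch under assumed data and is not Gradoop's API; the function `group_graph` and the example graph are hypothetical.

```python
from collections import Counter

def group_graph(vertices, edges, key):
    """Summarize a property graph by grouping vertices on the property `key`."""
    group_of = {vid: props[key] for vid, props in vertices.items()}
    # One super vertex per property value, annotated with the member count
    super_vertices = Counter(group_of.values())
    # One super edge per (source group, label, target group), annotated with a count
    super_edges = Counter((group_of[s], label, group_of[t]) for s, label, t in edges)
    return super_vertices, super_edges

# Hypothetical example graph
vertices = {
    1: {"city": "Leipzig"},
    2: {"city": "Leipzig"},
    3: {"city": "Dresden"},
}
edges = [(1, "knows", 2), (2, "knows", 3), (3, "knows", 1), (1, "knows", 3)]

super_vertices, super_edges = group_graph(vertices, edges, "city")
```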

Distributed Transactional Frequent Subgraph Mining with Gradoop
André Petermann – Uni Leipzig

I will talk about the graph-transaction setting of frequent subgraph mining and the challenges of implementing it on Apache Flink. In this setting, we are interested in subgraphs that occur in at least a minimum number of graphs of a collection. My focus will be on the various factors that impact scalability, such as minimizing global knowledge exchange.
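
The support notion of the graph-transaction setting can be illustrated for the simplest possible patterns, single labeled edges: a pattern counts once per graph, no matter how often it occurs inside that graph. The sketch below is a single-machine simplification and not the distributed algorithm of the talk; the function name and the example collection are hypothetical, and real frequent subgraph mining additionally grows larger patterns and tests them for isomorphism.

```python
from collections import Counter

def frequent_edge_patterns(graph_collection, min_support):
    """Return single-edge patterns contained in at least min_support graphs."""
    support = Counter()
    for graph in graph_collection:
        # set(): each pattern contributes at most once per graph
        support.update(set(graph))
    return {pattern: s for pattern, s in support.items() if s >= min_support}

# Hypothetical collection: each graph is a list of (source label, edge label, target label)
collection = [
    [("Person", "knows", "Person"), ("Person", "knows", "Person"),
     ("Person", "worksAt", "Company")],
    [("Person", "knows", "Person")],
    [("Person", "worksAt", "Company"), ("Company", "locatedIn", "City")],
]

frequent = frequent_edge_patterns(collection, min_support=2)
```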

15:15 - 16:45 Towards a Novel Concept for High Performance Graph Processing
Matthias Hauck – Uni Heidelberg

With the increasing importance of graph-structured data and the rising number of applications relying on a graph data model, there is a growing need to store, manipulate, and analyze graph data efficiently. To satisfy this need, a plethora of solutions, tailored for specific use cases and environments, has been proposed. These include graph database management systems, which provide a high-level abstraction, query optimization, transactions, and support for dynamic graphs, and graph processing engines, which in contrast provide a low-level abstraction and superior performance for graph algorithms. As of now, combining the advantages of both classes requires expensive data transfer and poses data synchronization challenges. We propose a graph engine concept that combines the advantages of a high-level graph interface with the performance advantages of a high-performance graph processing engine. Our concept fosters an adaptive execution strategy that gracefully handles dynamic graphs, multiple vertex and edge attributes, and concurrently running graph operations.

Energy Aware Graph Processing
Alexander Krause – TU Dresden

Efficient processing of large graphs on scale-out machines is a challenging task. The large number of cores, distributed among multiple sockets, implies data partitioning. This talk will give basic insights into graph processing on a NUMA machine with respect to energy awareness and into our options for saving energy on the software side.

Efficient Compression of Large Graphs
Jan Broß – KIT

Webgraphs are a representation of the linkage structure of the web. They represent the relationships among a certain set of URLs. Each URL in the set is represented by a node. An arc runs from node u to node v whenever page u contains a hyperlink to page v. Webgraph compression has attracted a lot of research, leading to various compression schemes that exploit typical properties of webgraphs. Up to now, however, the time needed to construct the compressed representation has been only partially addressed. We present a parallel construction technique for a competitive webgraph representation called k^2-trees. The technique allows for fast compression of billion-node graphs to 1-4 bits per edge. Furthermore, we present a technique that accelerates query times by a factor of two compared to previous work.
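
To make the k^2-tree representation concrete, the following sketch builds one sequentially for a small adjacency matrix and answers edge queries on it. It recursively splits the matrix into k*k submatrices and stores one bit per submatrix (1 if it contains at least one arc), level by level. This is a plain, unoptimized illustration and not the parallel construction technique of the talk; the function names and the example graph are hypothetical, and the rank computation is done by naive counting rather than with a succinct rank structure.

```python
def build_k2_tree(edges, n, k=2):
    """Return (T, L, size): internal-level bits, leaf-level bits, padded matrix size."""
    size = k
    while size < n:
        size *= k  # pad the n x n matrix to a power of k
    internal, leaves = [], []
    current = [(0, 0, list(edges))]  # (row offset, column offset, arcs inside)
    s = size
    while s > 1:
        sub = s // k
        bits, nxt = [], []
        for row0, col0, arcs in current:
            buckets = [[] for _ in range(k * k)]
            for u, v in arcs:
                buckets[((u - row0) // sub) * k + (v - col0) // sub].append((u, v))
            for idx, bucket in enumerate(buckets):
                bits.append(1 if bucket else 0)
                if bucket:  # only non-empty submatrices are subdivided further
                    nxt.append((row0 + (idx // k) * sub, col0 + (idx % k) * sub, bucket))
        if sub == 1:
            leaves = bits
        else:
            internal.extend(bits)
        current, s = nxt, sub
    return internal, leaves, size

def has_edge(T, L, size, u, v, k=2):
    """Check whether arc (u, v) is present by descending from the root."""
    bits = T + L
    base, s = 0, size
    while True:
        sub = s // k
        pos = base + (u // sub) * k + (v // sub)
        if bits[pos] == 0:
            return False
        if pos >= len(T):
            return True  # reached a set bit on the leaf level
        base = sum(T[:pos + 1]) * k * k  # children start at rank1(T, pos) * k^2
        u, v, s = u % sub, v % sub, sub

# Hypothetical 4-node graph
arcs = [(0, 1), (2, 3), (3, 2)]
T, L, size = build_k2_tree(arcs, n=4)
```

Empty submatrices cost a single 0 bit regardless of their size, which is why sparse, clustered adjacency matrices compress well in this representation.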

17:00 - 19:00 Path Sharing for Regular Path Queries
Frank Tetzel – TU Dresden

Regular Path Queries (RPQs) are a major part of recent graph query languages like SPARQL and PGQL. They allow the definition of recursive path structures through regular expressions over the alphabet of edge labels. On large graphs, the number of matching paths can be huge, especially when the recursion bound is unlimited. To cope with exploding intermediate results, different processing strategies have been proposed. The regular expression can be split into subparts at rare labels, which heavily reduces the search space. A reachability index can be leveraged to answer short recursive subparts of a path. For longer recursive subparts, we propose Path Sharing, which visits a region of the data graph only once and shares the results with all paths connected to this region.
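
As background, a common baseline for evaluating an RPQ is a breadth-first search over pairs of (graph node, automaton state), where the automaton recognizes the regular expression over edge labels. The sketch below shows that baseline only; it implements neither the rare-label split, the reachability index, nor Path Sharing. The automaton is given by hand for the hypothetical query `knows+ / worksAt`, and all names and data are assumptions for illustration.

```python
from collections import deque

def evaluate_rpq(edges, nfa, start_state, accept_states, source):
    """Return all nodes reachable from source along a path whose labels the NFA accepts."""
    adjacency = {}
    for u, label, v in edges:
        adjacency.setdefault(u, []).append((label, v))
    seen = {(source, start_state)}
    queue = deque([(source, start_state)])
    results = set()
    while queue:
        node, state = queue.popleft()
        if state in accept_states:
            results.add(node)
        for label, target in adjacency.get(node, []):
            for next_state in nfa.get((state, label), ()):
                pair = (target, next_state)
                if pair not in seen:  # each (node, state) pair is expanded once
                    seen.add(pair)
                    queue.append(pair)
    return results

# Hypothetical NFA for `knows+ / worksAt`: (state, label) -> set of next states
nfa = {(0, "knows"): {1}, (1, "knows"): {1}, (1, "worksAt"): {2}}

# Hypothetical data graph with a cycle among the `knows` edges
edges = [
    ("alice", "knows", "bob"), ("bob", "knows", "carol"), ("carol", "knows", "alice"),
    ("bob", "worksAt", "initech"), ("carol", "worksAt", "acme"),
    ("dave", "worksAt", "globex"),
]

employers = evaluate_rpq(edges, nfa, start_state=0, accept_states={2}, source="alice")
```

Tracking visited (node, state) pairs guarantees termination on cyclic graphs even though the recursion in `knows+` is unbounded.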

Platform transparent processing of Linked Data with Piglet
Stefan Hagedorn – TU Ilmenau

Data exploration implies the incremental analysis of data sets, starting with data cleaning and the removal of invalid entries and then working toward the information of interest. For this, the information contained in one data set often has to be combined with data from other sources, which in many cases are available as Linked Data in RDF format. The de facto standard query language for RDF data is SPARQL, whose core consists of basic graph patterns (BGPs) that are used to formulate the query. For processing potentially large data sets, platforms like Hadoop and Spark have emerged, and for Hadoop, the Pig Latin language became popular because it relieves users of the burden of writing plain MapReduce programs. However, the original Pig Latin has no support for SPARQL and can only formulate such queries using complex and cumbersome Join/Filter constructs. Our Piglet project aims at combining these two worlds: it is a compiler and code generator that transforms Pig Latin input into programs for Hadoop, Spark, and Flink. Additionally, it extends the original Pig Latin by introducing a tuplified data structure and the BGP filter operator to efficiently query RDF graphs.

Efficient Graph Processing on Business Data in SAP HANA
Marcus Paradies – SAP

Increasingly, enterprises from various domains, such as the financial, insurance, and pharmaceutical industries, explore and analyze the connections between data records in traditional customer-relationship management and enterprise-resource-planning systems. Typically, these industries rely on mature RDBMS technology to retain a single source of truth and access. Although graph structure is already latent in the relational schema and inherently represented in foreign key relationships, managing native graph data is moving into focus as it allows rapid application development in the absence of an upfront defined database schema. In this presentation I will give a technical overview of the SAP HANA Graph engine and its architecture, and discuss major design decisions. Based on selected use cases, I will demonstrate the two language interfaces that we currently expose: a declarative, pattern-matching-based query language and an imperative query language for writing custom graph algorithms against SAP HANA Graph. I will conclude with a list of open research challenges in graph data management and potential application areas for graphs from an enterprise perspective.