Commit 7e90283

bbonnin authored and bzz committed
Add an Elasticsearch interpreter
### Elasticsearch Interpreter

Interpreter for querying Elasticsearch. Supported requests are "get document by id", "search documents", "delete by id", "count documents" and "index / update a document".

Supported versions of Elasticsearch: >= 2.1

Author: Bruno Bonnin <[email protected]>
Author: Bruno Bonnin <[email protected]>

Closes apache#520 from bbonnin/master and squashes the following commits:

e40c06d [Bruno Bonnin] Remove duplicate dependency license (same groupid/artifactid)
98822df [Bruno Bonnin] Merge branch 'master' of https://github.com/apache/incubator-zeppelin
32ea103 [Bruno Bonnin] Update elasticsearch.md
e8a9ff2 [Bruno Bonnin] Merge branch 'master' of https://github.com/bbonnin/incubator-zeppelin
34a39a9 [Bruno Bonnin] Update tests with new search format
a3cf78c [Bruno Bonnin] Update elasticsearch.md
af319e6 [Bruno Bonnin] Update snapshots
ce7c15f [Bruno Bonnin] Update search command (use of query_string) + size config
6b6886b [Bruno Bonnin] Update doc with query DSL
2dfd129 [Bruno Bonnin] Check client before starting process (to get a better error msg)
7bf4232 [Bruno Bonnin] Doc : count command with a query
93253df [Bruno Bonnin] Count command with a query
d3b599c [Bruno Bonnin] Fix typo
25383db [Bruno Bonnin] Update doc (config, shield, completion)
3e1655d [Bruno Bonnin] Update elasticsearch.md
ee1547a [Bruno Bonnin] Ugly table for flattened data
46899d6 [Bruno Bonnin] Doc: flattened json and security
4044169 [Bruno Bonnin] Add completion
31c73b5 [Bruno Bonnin] Update tests
c68b3df [Bruno Bonnin] Update of LICENSE
455a072 [Bruno Bonnin] Fix pb from remarks of the PR
a10a5ec [Bruno Bonnin] Elasticsearch Interpreter
fdc413f [Bruno Bonnin] Elasticsearch Interpreter
1 parent 4c269e6 commit 7e90283

16 files changed: 1048 additions and 2 deletions

conf/zeppelin-site.xml.template

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@
 
 <property>
   <name>zeppelin.interpreters</name>
-  <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.angular.AngularInterpreter,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,org.apache.zeppelin.tajo.TajoInterpreter,org.apache.zeppelin.flink.FlinkInterpreter,org.apache.zeppelin.lens.LensInterpreter,org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.IgniteSqlInterpreter,org.apache.zeppelin.cassandra.CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.apache.zeppelin.postgresql.PostgreSqlInterpreter,org.apache.zeppelin.phoenix.PhoenixInterpreter,org.apache.zeppelin.kylin.KylinInterpreter</value>
+  <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.angular.AngularInterpreter,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,org.apache.zeppelin.tajo.TajoInterpreter,org.apache.zeppelin.flink.FlinkInterpreter,org.apache.zeppelin.lens.LensInterpreter,org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.IgniteSqlInterpreter,org.apache.zeppelin.cassandra.CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.apache.zeppelin.postgresql.PostgreSqlInterpreter,org.apache.zeppelin.phoenix.PhoenixInterpreter,org.apache.zeppelin.kylin.KylinInterpreter,org.apache.zeppelin.elasticsearch.ElasticsearchInterpreter</value>
   <description>Comma separated interpreter configurations. First interpreter become a default</description>
 </property>


docs/interpreter/elasticsearch.md

Lines changed: 228 additions & 0 deletions
@@ -0,0 +1,228 @@
1+
---
2+
layout: page
3+
title: "Elasticsearch Interpreter"
4+
description: ""
5+
group: manual
6+
---
7+
{% include JB/setup %}
8+
9+
10+
## Elasticsearch Interpreter for Apache Zeppelin
11+
12+
### 1. Configuration
13+
14+
<br/>
15+
<table class="table-configuration">
16+
<tr>
17+
<th>Property</th>
18+
<th>Default</th>
19+
<th>Description</th>
20+
</tr>
21+
<tr>
22+
<td>elasticsearch.cluster.name</td>
23+
<td>elasticsearch</td>
24+
<td>Cluster name</td>
25+
</tr>
26+
<tr>
27+
<td>elasticsearch.host</td>
28+
<td>localhost</td>
29+
<td>Host of a node in the cluster</td>
30+
</tr>
31+
<tr>
32+
<td>elasticsearch.port</td>
33+
<td>9300</td>
34+
<td>Connection port <b>(important: this is not the HTTP port, but the transport port)</b></td>
35+
</tr>
36+
<tr>
37+
<td>elasticsearch.result.size</td>
38+
<td>10</td>
39+
<td>The size of the result set of a search query</td>
40+
</tr>
41+
</table>
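
As a rough illustration of how the defaults in the table combine with user-supplied values, here is a minimal Python sketch. It is not the interpreter's code; `DEFAULTS` and `resolve_settings` are hypothetical names, and only the property names and default values come from the table.

```python
# Defaults copied from the configuration table above.
# The dict and the helper below are hypothetical, for illustration only.
DEFAULTS = {
    "elasticsearch.cluster.name": "elasticsearch",
    "elasticsearch.host": "localhost",
    "elasticsearch.port": "9300",  # transport port, not the HTTP port
    "elasticsearch.result.size": "10",
}


def resolve_settings(overrides=None):
    """Merge user-supplied properties over the documented defaults."""
    settings = dict(DEFAULTS)
    settings.update(overrides or {})
    # Port and result size are numeric, so fail early on a malformed value.
    settings["elasticsearch.port"] = int(settings["elasticsearch.port"])
    settings["elasticsearch.result.size"] = int(settings["elasticsearch.result.size"])
    return settings
```

Properties that are not in the table pass through unchanged, which loosely mirrors the idea that extra client properties can be added.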
42+
43+
<center>
44+
![Interpreter configuration](../assets/themes/zeppelin/img/docs-img/elasticsearch-config.png)
45+
</center>
46+
47+
48+
> Note #1: you can add more properties to configure the Elasticsearch client.
49+
50+
> Note #2: if you use Shield, you can add a property named `shield.user` whose value contains the user name and the password (format: `username:password`). For more details about Shield configuration, consult the [Shield reference guide](https://www.elastic.co/guide/en/shield/current/_using_elasticsearch_java_clients_with_shield.html). Do not forget to copy the Shield client jar into the interpreter directory (`ZEPPELIN_HOME/interpreter/elasticsearch`).
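
To make the `username:password` format concrete, the following Python sketch splits such a value. The function name is hypothetical, and splitting on the first colon only (so that a password may itself contain colons) is an assumption, not something the Shield documentation is quoted on here.

```python
def split_shield_user(value):
    """Split a `username:password` value on the first colon only (assumed rule)."""
    username, sep, password = value.partition(":")
    if not sep or not username:
        raise ValueError("expected the form username:password")
    return username, password
```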
51+
52+
53+
<hr/>
54+
55+
### 2. Enabling the Elasticsearch Interpreter
56+
57+
In a notebook, to enable the **Elasticsearch** interpreter, click the **Gear** icon and select **Elasticsearch**.
58+
59+
60+
<hr/>
61+
62+
63+
### 3. Using the Elasticsearch Interpreter
64+
65+
In a paragraph, use `%elasticsearch` to select the Elasticsearch interpreter, then enter your commands. To list the available commands, use `help`.
66+
67+
```bash
68+
| %elasticsearch
69+
| help
70+
Elasticsearch interpreter:
71+
General format: <command> /<indices>/<types>/<id> <option> <JSON>
72+
- indices: list of indices separated by commas (depends on the command)
73+
- types: list of document types separated by commas (depends on the command)
74+
Commands:
75+
- search /indices/types <query>
76+
. indices and types can be omitted (at least, you have to provide '/')
77+
. a query is either a JSON-formatted query or a Lucene query
78+
- size <value>
79+
. defines the size of the result set (default value is in the config)
80+
. if used, this command must be declared before a search command
81+
- count /indices/types <query>
82+
. same comments as for the search
83+
- get /index/type/id
84+
- delete /index/type/id
85+
- index /index/type/id <json-formatted document>
86+
. the id can be omitted, elasticsearch will generate one
87+
```
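
The general format shown in the help text can be sketched as a small parser. This is an illustrative Python sketch under stated assumptions, not the interpreter's actual parsing logic: `parse_command` and the keys of the returned dict are hypothetical.

```python
def parse_command(line):
    """Split '<command> /<indices>/<types>/<id> <rest>' into its parts."""
    tokens = line.strip().split(None, 2)
    if len(tokens) < 2:
        raise ValueError("expected '<command> <argument>'")
    command, target = tokens[0], tokens[1]
    rest = tokens[2] if len(tokens) > 2 else ""
    if command == "size":
        # 'size <value>' carries a number instead of a path.
        return {"command": command, "size": int(target)}
    if not target.startswith("/"):
        raise ValueError("the path must start with '/'")
    # '/' alone yields no segments, read here as "all indices and types".
    segments = [s for s in target.split("/") if s]
    return {
        "command": command,
        "indices": segments[0].split(",") if len(segments) > 0 else [],
        "types": segments[1].split(",") if len(segments) > 1 else [],
        "id": segments[2] if len(segments) > 2 else None,
        "body": rest,  # JSON document, JSON query, or query string
    }
```

For example, `parse_command("get /index/type/id")` yields one index, one type, the id `"id"` and an empty body.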
88+
89+
> Tip: use (CTRL + .) for completion
90+
91+
92+
#### get
93+
With the `get` command, you can find a document by id. The result is a JSON document.
94+
95+
```bash
96+
| %elasticsearch
97+
| get /index/type/id
98+
```
99+
100+
Example:
101+
![Elasticsearch - Get](../assets/themes/zeppelin/img/docs-img/elasticsearch-get.png)
102+
103+
104+
#### search
105+
With the `search` command, you can send a search query to Elasticsearch. There are two formats of query:
106+
* You can provide a JSON-formatted query, exactly as you would send it to the Elasticsearch REST API.
107+
* See [Elasticsearch search API reference document](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html) for more details about the content of the search queries.
108+
* You can also provide the content of a `query_string`
109+
* This is a shortcut for a query like this: `{ "query": { "query_string": { "query": "__HERE YOUR QUERY__", "analyze_wildcard": true } } }`
110+
* See [Elasticsearch query string syntax](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax) for more details about the content of such a query.
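
The two query formats described above can be sketched as follows. This Python snippet is only an illustration: `build_search_body` is a hypothetical name, and deciding between the formats by checking for a leading `{` is an assumption about how the interpreter might distinguish them.

```python
import json


def build_search_body(query):
    """Return a search body as a dict from a JSON query or a query string."""
    text = query.strip()
    if text.startswith("{"):
        # Treated as a JSON-formatted query and passed through unchanged.
        return json.loads(text)
    # Otherwise wrap the text in the query_string shortcut from the docs.
    return {"query": {"query_string": {"query": text, "analyze_wildcard": True}}}
```

With this sketch, `request.method:GET AND status:200` becomes the `query` value of a `query_string` element, while a JSON query is used as given.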
111+
112+
```bash
113+
| %elasticsearch
114+
| search /index1,index2,.../type1,type2,... <JSON document containing the query or query_string elements>
115+
```
116+
117+
To change the size of the result set, add a `size` line before your search command.
118+
119+
```bash
120+
| %elasticsearch
121+
| size 50
122+
| search /index1,index2,.../type1,type2,... <JSON document containing the query or query_string elements>
123+
```
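
The ordering rule (a `size` line only affects searches that come after it) can be pictured with this Python sketch. `plan_searches` is a hypothetical helper, and whether a size persists across several searches in the real interpreter is an assumption made here for illustration.

```python
def plan_searches(lines, default_size=10):
    """Pair each search line with the size in effect when it appears."""
    size = default_size
    planned = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("size "):
            size = int(line.split()[1])  # applies to later searches only
        elif line.startswith("search "):
            planned.append((size, line))
    return planned
```

A `size` line placed after the search has no effect on it, so that search keeps the configured default.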
124+
125+
126+
Examples:
127+
* With a JSON query:
128+
```bash
129+
| %elasticsearch
130+
| search / { "query": { "match_all": {} } }
131+
132+
| %elasticsearch
133+
| search /logs { "query": { "query_string": { "query": "request.method:GET AND status:200" } } }
134+
```
135+
136+
* With query_string elements:
137+
```bash
138+
| %elasticsearch
139+
| search /logs request.method:GET AND status:200
140+
141+
| %elasticsearch
142+
| search /logs (404 AND (POST OR DELETE))
143+
```
144+
145+
> **Important**: a document in Elasticsearch is a JSON document, so it is hierarchical, not flat like a row in a SQL table.
146+
For the Elasticsearch interpreter, the result of a search query is flattened.
147+
148+
Suppose we have a JSON document:
149+
```json
150+
{
151+
"date": "2015-12-08T21:03:13.588Z",
152+
"request": {
153+
"method": "GET",
154+
"url": "/zeppelin/4cd001cd-c517-4fa9-b8e5-a06b8f4056c4",
155+
"headers": [ "Accept: *.*", "Host: apache.org"]
156+
},
157+
"status": "403"
158+
}
159+
```
160+
161+
The data will be flattened like this:
162+
163+
date | request.headers[0] | request.headers[1] | request.method | request.url | status
164+
-----|--------------------|--------------------|----------------|-------------|-------
165+
2015-12-08T21:03:13.588Z | Accept: \*.\* | Host: apache.org | GET | /zeppelin/4cd001cd-c517-4fa9-b8e5-a06b8f4056c4 | 403
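
The flattening shown in the table can be reproduced with a short recursive function. This is a Python stand-in written for illustration; it is not the library the interpreter depends on, and the `flatten` name is hypothetical. Nested keys are joined with `.` and list positions are written as `[n]`, matching the column names above.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a single-level dict."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        # Scalars end the recursion and are stored under the built-up path.
        flat[prefix] = value
    return flat
```

Note that in this sketch empty objects and empty arrays produce no column at all.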
166+
167+
168+
Examples:
169+
* With a table containing the results:
170+
![Elasticsearch - Search - table](../assets/themes/zeppelin/img/docs-img/elasticsearch-search-table.png)
171+
172+
173+
* You can also use a predefined diagram:
174+
![Elasticsearch - Search - diagram](../assets/themes/zeppelin/img/docs-img/elasticsearch-search-pie.png)
175+
176+
* With a JSON query:
177+
![Elasticsearch - Search with query](../assets/themes/zeppelin/img/docs-img/elasticsearch-search-json-query-table.png)
178+
179+
* With a query string:
180+
![Elasticsearch - Search with query string](../assets/themes/zeppelin/img/docs-img/elasticsearch-query-string.png)
181+
182+
183+
#### count
184+
With the `count` command, you can count the documents in the given indices and types. You can also provide a query to restrict the count.
185+
186+
```bash
187+
| %elasticsearch
188+
| count /index1,index2,.../type1,type2,... <JSON document containing the query OR a query string>
189+
```
190+
191+
Examples:
192+
* Without query:
193+
![Elasticsearch - Count](../assets/themes/zeppelin/img/docs-img/elasticsearch-count.png)
194+
195+
* With a query:
196+
![Elasticsearch - Count with query](../assets/themes/zeppelin/img/docs-img/elasticsearch-count-with-query.png)
197+
198+
199+
#### index
200+
With the `index` command, you can insert/update a document in Elasticsearch.
201+
```bash
202+
| %elasticsearch
203+
| index /index/type/id <JSON document>
204+
205+
| %elasticsearch
206+
| index /index/type <JSON document>
207+
```
208+
209+
#### delete
210+
With the `delete` command, you can delete a document.
211+
212+
```bash
213+
| %elasticsearch
214+
| delete /index/type/id
215+
```
216+
217+
218+
219+
#### Apply Zeppelin Dynamic Forms
220+
221+
You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features.
222+
223+
```bash
224+
%elasticsearch
225+
size ${limit=10}
226+
search /index/type { "query": { "match_all": {} } }
227+
```
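
To show what the `${limit=10}` placeholder amounts to before the command reaches the interpreter, here is a minimal Python sketch of the substitution. It is not Zeppelin's implementation: `fill_forms` is a hypothetical name, and the pattern only covers the simple `${name=default}` text-input form, not the select form.

```python
import re

# Matches ${name} or ${name=default}; select-form options are not handled.
FORM = re.compile(r"\$\{(\w+)(?:=([^}]*))?\}")


def fill_forms(text, values=None):
    """Replace ${name=default} placeholders with supplied or default values."""
    values = values or {}

    def replace(match):
        name, default = match.group(1), match.group(2) or ""
        return str(values.get(name, default))

    return FORM.sub(replace, text)
```

With no value entered in the form, `size ${limit=10}` resolves to `size 10`; entering 50 gives `size 50`.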
228+

elasticsearch/pom.xml

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<!--
3+
~ Licensed to the Apache Software Foundation (ASF) under one or more
4+
~ contributor license agreements. See the NOTICE file distributed with
5+
~ this work for additional information regarding copyright ownership.
6+
~ The ASF licenses this file to You under the Apache License, Version 2.0
7+
~ (the "License"); you may not use this file except in compliance with
8+
~ the License. You may obtain a copy of the License at
9+
~
10+
~ http://www.apache.org/licenses/LICENSE-2.0
11+
~
12+
~ Unless required by applicable law or agreed to in writing, software
13+
~ distributed under the License is distributed on an "AS IS" BASIS,
14+
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
~ See the License for the specific language governing permissions and
16+
~ limitations under the License.
17+
-->
18+
19+
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
20+
<modelVersion>4.0.0</modelVersion>
21+
22+
<parent>
23+
<artifactId>zeppelin</artifactId>
24+
<groupId>org.apache.zeppelin</groupId>
25+
<version>0.6.0-incubating-SNAPSHOT</version>
26+
<relativePath>..</relativePath>
27+
</parent>
28+
29+
<groupId>org.apache.zeppelin</groupId>
30+
<artifactId>zeppelin-elasticsearch</artifactId>
31+
<packaging>jar</packaging>
32+
<version>0.6.0-incubating-SNAPSHOT</version>
33+
<name>Zeppelin: Elasticsearch interpreter</name>
34+
<url>http://www.apache.org</url>
35+
36+
<properties>
37+
<elasticsearch.version>2.1.0</elasticsearch.version>
38+
<guava.version>18.0</guava.version>
39+
<json-flattener.version>0.1.1</json-flattener.version>
40+
</properties>
41+
42+
<dependencies>
43+
<dependency>
44+
<groupId>org.apache.zeppelin</groupId>
45+
<artifactId>zeppelin-interpreter</artifactId>
46+
<version>${project.version}</version>
47+
<scope>provided</scope>
48+
</dependency>
49+
50+
<dependency>
51+
<groupId>org.elasticsearch</groupId>
52+
<artifactId>elasticsearch</artifactId>
53+
<version>${elasticsearch.version}</version>
54+
</dependency>
55+
56+
<dependency>
57+
<groupId>com.google.guava</groupId>
58+
<artifactId>guava</artifactId>
59+
<version>${guava.version}</version>
60+
</dependency>
61+
62+
<dependency>
63+
<groupId>com.github.wnameless</groupId>
64+
<artifactId>json-flattener</artifactId>
65+
<version>${json-flattener.version}</version>
66+
</dependency>
67+
68+
<dependency>
69+
<groupId>org.slf4j</groupId>
70+
<artifactId>slf4j-api</artifactId>
71+
</dependency>
72+
73+
<dependency>
74+
<groupId>junit</groupId>
75+
<artifactId>junit</artifactId>
76+
<scope>test</scope>
77+
</dependency>
78+
</dependencies>
79+
80+
<build>
81+
<plugins>
82+
<plugin>
83+
<groupId>org.apache.maven.plugins</groupId>
84+
<artifactId>maven-deploy-plugin</artifactId>
85+
<version>2.7</version>
86+
<configuration>
87+
<skip>true</skip>
88+
</configuration>
89+
</plugin>
90+
91+
<plugin>
92+
<artifactId>maven-enforcer-plugin</artifactId>
93+
<version>1.3.1</version>
94+
<executions>
95+
<execution>
96+
<id>enforce</id>
97+
<phase>none</phase>
98+
</execution>
99+
</executions>
100+
</plugin>
101+
102+
<plugin>
103+
<artifactId>maven-dependency-plugin</artifactId>
104+
<version>2.8</version>
105+
<executions>
106+
<execution>
107+
<id>copy-dependencies</id>
108+
<phase>package</phase>
109+
<goals>
110+
<goal>copy-dependencies</goal>
111+
</goals>
112+
<configuration>
113+
<outputDirectory>${project.build.directory}/../../interpreter/elasticsearch</outputDirectory>
114+
<overWriteReleases>false</overWriteReleases>
115+
<overWriteSnapshots>false</overWriteSnapshots>
116+
<overWriteIfNewer>true</overWriteIfNewer>
117+
<includeScope>runtime</includeScope>
118+
</configuration>
119+
</execution>
120+
<execution>
121+
<id>copy-artifact</id>
122+
<phase>package</phase>
123+
<goals>
124+
<goal>copy</goal>
125+
</goals>
126+
<configuration>
127+
<outputDirectory>${project.build.directory}/../../interpreter/elasticsearch</outputDirectory>
128+
<overWriteReleases>false</overWriteReleases>
129+
<overWriteSnapshots>false</overWriteSnapshots>
130+
<overWriteIfNewer>true</overWriteIfNewer>
131+
<includeScope>runtime</includeScope>
132+
<artifactItems>
133+
<artifactItem>
134+
<groupId>${project.groupId}</groupId>
135+
<artifactId>${project.artifactId}</artifactId>
136+
<version>${project.version}</version>
137+
<type>${project.packaging}</type>
138+
</artifactItem>
139+
</artifactItems>
140+
</configuration>
141+
</execution>
142+
</executions>
143+
</plugin>
144+
</plugins>
145+
</build>
146+
147+
</project>
