Posted to issues@metron.apache.org by nickwallen <gi...@git.apache.org> on 2018/10/25 19:04:48 UTC

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

GitHub user nickwallen opened a pull request:

    https://github.com/apache/metron/pull/1247

    METRON-1845 Correct Test Data Load in Elasticsearch Integration Tests

    The Elasticsearch integration tests use the legacy Transport client to load test data into the search indexes before running the tests. Loading the test data like this does not accurately reflect how the indices will appear in a production environment.
    
    This should be changed to use our existing `ElasticsearchUpdateDao` to write the test data.  Doing so ensures that any changes made to the 'write' portion of our Elasticsearch code will function correctly with the 'read' portion, so that telemetry written into Elasticsearch by 'Indexing' can be read correctly by the Alerts UI.
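
The idea of loading fixtures through the same write path that the read path depends on can be sketched as below. This is only an illustration under stated assumptions: `SketchUpdateDao`, `SketchSearchDao`, and `InMemoryBackend` are hypothetical stand-ins invented for this example and are not Metron's actual `UpdateDao`/`SearchDao` interfaces; the index name is borrowed from the test in this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TestDataLoadSketch {

  // Load fixtures through the write path, keyed by each document's guid.
  static void loadTestData(SketchUpdateDao dao, String index, List<Map<String, Object>> docs) {
    for (Map<String, Object> doc : docs) {
      dao.update(index, String.valueOf(doc.get("guid")), doc);
    }
  }

  public static void main(String[] args) {
    InMemoryBackend backend = new InMemoryBackend();
    List<Map<String, Object>> docs = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      Map<String, Object> doc = new LinkedHashMap<>();
      doc.put("guid", "bro_" + i);
      doc.put("source.type", "bro");
      docs.add(doc);
    }
    loadTestData(backend, "bro_index_2017.01.01.01", docs);

    // Read back through the read path; a mismatch between the two sides would surface here.
    System.out.println(backend.findAll("bro_index_2017.01.01.01").size()); // prints 3
  }
}

// Hypothetical write side of an index (an assumption, not Metron's interface).
interface SketchUpdateDao {
  void update(String index, String guid, Map<String, Object> doc);
}

// Hypothetical read side of an index (an assumption, not Metron's interface).
interface SketchSearchDao {
  List<Map<String, Object>> findAll(String index);
}

// In-memory backend implementing both sides, standing in for a real search index.
class InMemoryBackend implements SketchUpdateDao, SketchSearchDao {
  private final Map<String, Map<String, Map<String, Object>>> indices = new LinkedHashMap<>();

  @Override
  public void update(String index, String guid, Map<String, Object> doc) {
    indices.computeIfAbsent(index, k -> new LinkedHashMap<>()).put(guid, doc);
  }

  @Override
  public List<Map<String, Object>> findAll(String index) {
    List<Map<String, Object>> result = new ArrayList<>();
    Map<String, Map<String, Object>> docs = indices.get(index);
    if (docs != null) {
      result.addAll(docs.values());
    }
    return result;
  }
}
```

Because the fixture loader only sees the write-side interface, any change to how documents are written is exercised by every test that later reads them back.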
    
    - [ ] Do not merge until after #1242.  This change is dependent on #1242.  Take a look at the last commit to view the changes specific to this PR.
    
    ## Testing
    
    This only changes the integration tests.  If the integration tests pass, we're golden.
    
    ## Pull Request Checklist
    
    - [ ] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
    - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [ ] Have you included steps or a guide to how the change may be verified and tested manually?
    - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
    - [ ] Have you written or updated unit tests and/or integration tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/metron METRON-1845

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1247
    
----
commit a7c7dc287b4f9c99c6780b934a0b6f433a03aa04
Author: cstella <ce...@...>
Date:   2018-10-09T00:06:52Z

    Casey Stella - elasticsearch rest client migration base work

commit 10410ea9718a2a1b1d287fb4f22a6c98efb1fdaa
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-09T00:07:22Z

    Update shade plugin version

commit a33a16872118175ed35729df6ddde2959e49ae2f
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-09T15:56:08Z

    Fix es update dao test

commit 52c3c96d7657205e65d3bd0c0e35923a851911da
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-09T21:26:16Z

    Merge with master. Fix es search integration tests

commit 4742832869f9512e11cab4a32109c15a2b17a92e
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-11T18:45:06Z

    Merge branch 'master' into es-rest-client

commit 43809968320e586fd70411776140f3aa13a60195
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-11T23:59:25Z

    Get shade plugin working with the new ES client and the ClassIndexTransformer Shade plugin transformer.

commit af03f6f036e96c742db39733bf8ebc2cbf229129
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-19T02:56:52Z

    Introduce config classes for managing ES client configuration. Translate properties for new client.

commit 1c8eac2a173e1e24afc9e11163e456a9d90c93db
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-23T22:21:32Z

    Resolve merge conflicts with master

commit 1a47ded7a36f9d391227973c7da2921305373283
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-23T22:23:04Z

    Remove extra deps in metron-elasticsearch around log4j.

commit 54870d68e6f43e859367879bff7537f97c11d0bd
Author: Michael Miklavcic <mi...@...>
Date:   2018-10-24T00:33:25Z

    Fixes for dep version issues.

commit 554de87ae160aae1ae85afb3dbb01a220a9d6838
Author: Nick Allen <ni...@...>
Date:   2018-10-25T18:54:41Z

    METRON-1845 Correct Test Data Load in Elasticsearch Integration Tests

----


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236771158
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java ---
    @@ -194,35 +215,41 @@ public Client getClient() {
         return client;
       }
     
    -  public BulkResponse add(String indexName, String sensorType, String... docs) throws IOException {
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, String... docs)
    +          throws IOException, ParseException {
         List<String> d = new ArrayList<>();
         Collections.addAll(d, docs);
    -    return add(indexName, sensorType, d);
    +    add(updateDao, indexName, sensorType, d);
       }
     
    -  public BulkResponse add(String indexName, String sensorType, Iterable<String> docs)
    -      throws IOException {
    -    BulkRequestBuilder bulkRequest = getClient().prepareBulk();
    -    for (String doc : docs) {
    -      IndexRequestBuilder indexRequestBuilder = getClient()
    -          .prepareIndex(indexName, sensorType + "_doc");
    -
    -      indexRequestBuilder = indexRequestBuilder.setSource(doc);
    -      Map<String, Object> esDoc = JSONUtils.INSTANCE
    -          .load(doc, JSONUtils.MAP_SUPPLIER);
    -      indexRequestBuilder.setId((String) esDoc.get(Constants.GUID));
    -      Object ts = esDoc.get("timestamp");
    -      if (ts != null) {
    -        indexRequestBuilder = indexRequestBuilder.setTimestamp(ts.toString());
    -      }
    -      bulkRequest.add(indexRequestBuilder);
    -    }
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, Iterable<String> docs)
    --- End diff --
    
    > To that end, if we're looking to route all of this through the ES component in that fashion, it might make sense to simply replace the internal private Client client; and instead use the new desired IndexDao for the proxied calls to ES.
    
    I don't think we're ready to do all that quite yet.  There is still some legacy functionality in `ElasticSearchComponent` that uses the underlying client for the old Admin API.  See [close](https://github.com/apache/metron/blob/fcd644ca77394d48d460c460b672a23d6594f49b/metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java#L292), [createIndexWithMapping](https://github.com/apache/metron/blob/fcd644ca77394d48d460c460b672a23d6594f49b/metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java#L228), and [start](https://github.com/apache/metron/blob/fcd644ca77394d48d460c460b672a23d6594f49b/metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java#L153).


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236765044
  
    --- Diff: metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java ---
    @@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
         exception.expectMessage("Document contains at least one immense term in field=\"error_hash\"");
         getDao().update(errorDoc, Optional.of("error"));
       }
    +
    +  @Test
    +  @Override
    +  public void test() throws Exception {
    --- End diff --
    
    Ok, I found the problem. With my changes, the test data was getting loaded into both the ElasticsearchDao and the HBaseDao because I was using a MultiIndexDao to load the test data.  
    
    In master, only Elasticsearch gets loaded with the test data.  This was causing some of the test assumptions to fail.  I corrected how the data is loaded so it only loads Elasticsearch and matches the existing behavior.
    
    Thanks for pointing this out.
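
The effect described above, where loading through a composite DAO populates every backing store, can be illustrated with a small sketch. The `Writer`, `RecordingWriter`, and `FanOutWriter` types are hypothetical and exist only for this example; they are not Metron's `MultiIndexDao`, `ElasticsearchDao`, or `HBaseDao`, and the real classes may well behave differently in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FanOutLoadSketch {

  // Hypothetical minimal write interface; not Metron's actual UpdateDao.
  interface Writer {
    void write(String guid);
  }

  // Stand-in for a single backing store; it only records the guids written to it.
  static class RecordingWriter implements Writer {
    final List<String> written = new ArrayList<>();

    @Override
    public void write(String guid) {
      written.add(guid);
    }
  }

  // Hypothetical composite that forwards every write to all of its delegates.
  static class FanOutWriter implements Writer {
    private final List<Writer> delegates;

    FanOutWriter(Writer... delegates) {
      this.delegates = Arrays.asList(delegates);
    }

    @Override
    public void write(String guid) {
      for (Writer delegate : delegates) {
        delegate.write(guid);
      }
    }
  }

  static void load(Writer writer, int count) {
    for (int i = 0; i < count; i++) {
      writer.write("message" + i);
    }
  }

  public static void main(String[] args) {
    RecordingWriter searchIndex = new RecordingWriter();
    RecordingWriter keyValueStore = new RecordingWriter();

    // Loading through the composite populates both stores.
    load(new FanOutWriter(searchIndex, keyValueStore), 10);
    System.out.println(searchIndex.written.size() + " " + keyValueStore.written.size()); // prints "10 10"

    // Loading through the search writer alone leaves the other store untouched.
    RecordingWriter searchOnly = new RecordingWriter();
    RecordingWriter untouched = new RecordingWriter();
    load(searchOnly, 10);
    System.out.println(searchOnly.written.size() + " " + untouched.written.size()); // prints "10 0"
  }
}
```

A test that assumes the second store starts empty, and then expects exactly one entry after a single update, would presumably only hold in the second scenario.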


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/metron/pull/1247


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236643962
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    +
    +    // write the test documents for Bro
    +    List<String> broDocuments = new ArrayList<>();
    +    for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
    +      broDocuments.add(((JSONObject) broObject).toJSONString());
         }
    -    JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
    -    for (Object o : snortArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.add(updateDao, broIndex, "bro", broDocuments);
    +
    +    // write the test documents for Snort
    +    List<String> snortDocuments = new ArrayList<>();
    +    for (Object snortObject: (JSONArray) new JSONParser().parse(snortData)) {
    +      snortDocuments.add(((JSONObject) snortObject).toJSONString());
         }
    -    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    -    if (bulkResponse.hasFailures()) {
    -      throw new RuntimeException("Failed to index test data");
    +    es.add(updateDao, snortIndex, "snort", snortDocuments);
    +
    +    // wait until the test documents are visible
    +    assertEventually(() -> Assert.assertEquals(10, findAll(searchDao).getTotal()));
    --- End diff --
    
    > Are you able to get this to fail consistently? I ran it a number of times and couldn't get it to fail.
    
    Can you describe what you did? You just removed line 190 and the tests continue to work reliably?
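
For readers unfamiliar with the pattern, the "wait until the test documents are visible" step in the diff above amounts to re-running an assertion until it passes or a retry budget runs out. The following is a minimal sketch of that idea only. The `assertEventually` helper, its `Assertion` interface, and the `maxAttempts`/`sleepMillis` parameters are assumptions made for this example; the signature and retry policy of the helper actually used in the PR are not shown in this thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AssertEventuallySketch {

  // Hypothetical functional interface for an assertion that may fail with an AssertionError.
  interface Assertion {
    void apply();
  }

  // Re-runs the assertion until it passes; rethrows the last failure once maxAttempts is exhausted.
  static void assertEventually(Assertion assertion, int maxAttempts, long sleepMillis) {
    AssertionError last = new AssertionError("no attempts were made");
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        assertion.apply();
        return;
      } catch (AssertionError e) {
        last = e;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException ie) {
        // Preserve the interrupt flag and stop retrying.
        Thread.currentThread().interrupt();
        break;
      }
    }
    throw last;
  }

  public static void main(String[] args) {
    AtomicInteger visible = new AtomicInteger(0);

    // Simulate documents becoming visible over successive polls: 5 on the first, 10 on the second.
    assertEventually(() -> {
      if (visible.addAndGet(5) < 10) {
        throw new AssertionError("expected 10 documents but found " + visible.get());
      }
    }, 5, 10L);

    System.out.println(visible.get()); // prints 10
  }
}
```

Polling in this way trades a fixed sleep for a bounded wait, which is one common way to cope with writes that only become searchable after an index refresh.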
    



---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236765451
  
    --- Diff: metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java ---
    @@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
         exception.expectMessage("Document contains at least one immense term in field=\"error_hash\"");
         getDao().update(errorDoc, Optional.of("error"));
       }
    +
    +  @Test
    +  @Override
    +  public void test() throws Exception {
    +    List<Map<String, Object>> inputData = new ArrayList<>();
    +    for(int i = 0; i < 10;++i) {
    +      final String name = "message" + i;
    +      inputData.add(
    +              new HashMap<String, Object>() {{
    +                put("source.type", SENSOR_NAME);
    +                put("name" , name);
    +                put("timestamp", System.currentTimeMillis());
    +                put(Constants.GUID, name);
    +              }}
    +      );
    +    }
    +    addTestData(getIndexName(), SENSOR_NAME, inputData);
    +    List<Map<String,Object>> docs = null;
    +    for(int t = 0;t < MAX_RETRIES;++t, Thread.sleep(SLEEP_MS)) {
    +      docs = getIndexedTestData(getIndexName(), SENSOR_NAME);
    +      if(docs.size() >= 10) {
    +        break;
    +      }
    +    }
    +    Assert.assertEquals(10, docs.size());
    +    //modify the first message and add a new field
    +    {
    +      Map<String, Object> message0 = new HashMap<String, Object>(inputData.get(0)) {{
    +        put("new-field", "metron");
    +      }};
    +      String guid = "" + message0.get(Constants.GUID);
    +      Document update = getDao().replace(new ReplaceRequest(){{
    +        setReplacement(message0);
    +        setGuid(guid);
    +        setSensorType(SENSOR_NAME);
    +        setIndex(getIndexName());
    +      }}, Optional.empty());
    +
    +      Assert.assertEquals(message0, update.getDocument());
    +      Assert.assertEquals(1, getMockHTable().size());
    +      findUpdatedDoc(message0, guid, SENSOR_NAME);
    +      {
    +        //ensure hbase is up to date
    +        Get g = new Get(HBaseDao.Key.toBytes(new HBaseDao.Key(guid, SENSOR_NAME)));
    +        Result r = getMockHTable().get(g);
    +        NavigableMap<byte[], byte[]> columns = r.getFamilyMap(CF.getBytes());
    +        Assert.assertEquals(1, columns.size());
    +        Assert.assertEquals(message0
    +                , JSONUtils.INSTANCE.load(new String(columns.lastEntry().getValue())
    +                        , JSONUtils.MAP_SUPPLIER)
    +        );
    +      }
    +      {
    +        //ensure ES is up-to-date
    --- End diff --
    
    No longer a problem.


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r228299751
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,48 +118,81 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    --- End diff --
    
    All of this because I need an `UpdateDao` to write the test data and a `SearchDao` to make sure the test data is loaded.  Is there a simpler way?


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236767632
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,48 +118,81 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    --- End diff --
    
    Yes! That worked.  Much cleaner. Thanks


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236644723
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java ---
    @@ -194,35 +215,41 @@ public Client getClient() {
         return client;
       }
     
    -  public BulkResponse add(String indexName, String sensorType, String... docs) throws IOException {
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, String... docs)
    +          throws IOException, ParseException {
         List<String> d = new ArrayList<>();
         Collections.addAll(d, docs);
    -    return add(indexName, sensorType, d);
    +    add(updateDao, indexName, sensorType, d);
       }
     
    -  public BulkResponse add(String indexName, String sensorType, Iterable<String> docs)
    -      throws IOException {
    -    BulkRequestBuilder bulkRequest = getClient().prepareBulk();
    -    for (String doc : docs) {
    -      IndexRequestBuilder indexRequestBuilder = getClient()
    -          .prepareIndex(indexName, sensorType + "_doc");
    -
    -      indexRequestBuilder = indexRequestBuilder.setSource(doc);
    -      Map<String, Object> esDoc = JSONUtils.INSTANCE
    -          .load(doc, JSONUtils.MAP_SUPPLIER);
    -      indexRequestBuilder.setId((String) esDoc.get(Constants.GUID));
    -      Object ts = esDoc.get("timestamp");
    -      if (ts != null) {
    -        indexRequestBuilder = indexRequestBuilder.setTimestamp(ts.toString());
    -      }
    -      bulkRequest.add(indexRequestBuilder);
    -    }
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, Iterable<String> docs)
    --- End diff --
    
    > Might it be better to just use IndexDao, which extends UpdateDao, SearchDao, RetrieveLatestDao, ColumnMetadataDao? 
    
    Why would that be better?  It needs to do updates, so it needs an `UpdateDao`.  Seemed logical to me.
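
The trade-off being discussed, accepting the narrow interface versus the composite one, can be sketched as follows. The `Updater`, `Searcher`, and `Index` interfaces are hypothetical names chosen for this example and only loosely mirror the relationship described in the quoted comment (a composite DAO interface extending several narrower ones); they are not Metron's actual types.

```java
import java.util.HashMap;
import java.util.Map;

class NarrowInterfaceSketch {

  // Hypothetical narrow interfaces (assumptions for illustration).
  interface Updater {
    void update(String guid, String doc);
  }

  interface Searcher {
    int count();
  }

  // Hypothetical composite interface that extends both narrow ones.
  interface Index extends Updater, Searcher {
  }

  static class InMemoryIndex implements Index {
    private final Map<String, String> docs = new HashMap<>();

    @Override
    public void update(String guid, String doc) {
      docs.put(guid, doc);
    }

    @Override
    public int count() {
      return docs.size();
    }
  }

  // Declares only the capability it needs: the ability to write documents.
  static void add(Updater updater, String... docs) {
    int i = 0;
    for (String doc : docs) {
      updater.update("guid-" + i++, doc);
    }
  }

  public static void main(String[] args) {
    Index index = new InMemoryIndex();

    // A composite instance is accepted wherever the narrow type is expected.
    add(index, "{\"a\":1}", "{\"a\":2}");

    // Because the parameter is narrow, a trivial stub also satisfies it.
    add((guid, doc) -> System.out.println("wrote " + guid), "{\"a\":3}");

    System.out.println(index.count()); // prints 2
  }
}
```

Under these assumptions, a method that takes the narrow type works with both kinds of caller, whereas a method that takes the composite type would rule out the stub.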


---

[GitHub] metron issue #1247: METRON-1845 Correct Test Data Load in Elasticsearch Inte...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on the issue:

    https://github.com/apache/metron/pull/1247
  
    As a side note, I tend to like the approach of using the DAO layers to read/write the way you've done here @nickwallen . I've used this approach in the past, and it helped with test coverage and simplifying the amount of extra custom test code required. It might be worth an overall discussion on how we want to architect testing other similar endpoints going forward.


---

[GitHub] metron issue #1247: METRON-1845 Correct Test Data Load in Elasticsearch Inte...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on the issue:

    https://github.com/apache/metron/pull/1247
  
    I have merged this with master to pull in the changes from #1242.  This is ready for review.


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236652983
  
    --- Diff: metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java ---
    @@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
         exception.expectMessage("Document contains at least one immense term in field=\"error_hash\"");
         getDao().update(errorDoc, Optional.of("error"));
       }
    +
    +  @Test
    +  @Override
    +  public void test() throws Exception {
    --- End diff --
    
    This test was previously shared in UpdateIntegrationTest between ES and Solr.  With these changes, the tests don't behave exactly the same anymore.  That being said, what I did here doesn't look right.   I'll dig into this and see what is going on.


---

[GitHub] metron issue #1247: METRON-1845 Correct Test Data Load in Elasticsearch Inte...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on the issue:

    https://github.com/apache/metron/pull/1247
  
    Still +1 on latest commit - that's a nice improvement. Glad the retry policy ended up working out!


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r237202358
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    +
    +    // write the test documents for Bro
    +    List<String> broDocuments = new ArrayList<>();
    +    for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
    +      broDocuments.add(((JSONObject) broObject).toJSONString());
         }
    -    JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
    -    for (Object o : snortArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.add(updateDao, broIndex, "bro", broDocuments);
    +
    +    // write the test documents for Snort
    +    List<String> snortDocuments = new ArrayList<>();
    +    for (Object snortObject: (JSONArray) new JSONParser().parse(snortData)) {
    +      snortDocuments.add(((JSONObject) snortObject).toJSONString());
         }
    -    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    -    if (bulkResponse.hasFailures()) {
    -      throw new RuntimeException("Failed to index test data");
    +    es.add(updateDao, snortIndex, "snort", snortDocuments);
    +
    +    // wait until the test documents are visible
    +    assertEventually(() -> Assert.assertEquals(10, findAll(searchDao).getTotal()));
    --- End diff --
    
    I meant on the ES client change PR, not on this one. Yes, I removed the `assertEventually` calls there, and the search integration test passed consistently for me.


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r228299271
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java ---
    @@ -194,35 +215,41 @@ public Client getClient() {
         return client;
       }
     
    -  public BulkResponse add(String indexName, String sensorType, String... docs) throws IOException {
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, String... docs)
    +          throws IOException, ParseException {
         List<String> d = new ArrayList<>();
         Collections.addAll(d, docs);
    -    return add(indexName, sensorType, d);
    +    add(updateDao, indexName, sensorType, d);
       }
     
    -  public BulkResponse add(String indexName, String sensorType, Iterable<String> docs)
    -      throws IOException {
    -    BulkRequestBuilder bulkRequest = getClient().prepareBulk();
    -    for (String doc : docs) {
    -      IndexRequestBuilder indexRequestBuilder = getClient()
    -          .prepareIndex(indexName, sensorType + "_doc");
    -
    -      indexRequestBuilder = indexRequestBuilder.setSource(doc);
    -      Map<String, Object> esDoc = JSONUtils.INSTANCE
    -          .load(doc, JSONUtils.MAP_SUPPLIER);
    -      indexRequestBuilder.setId((String) esDoc.get(Constants.GUID));
    -      Object ts = esDoc.get("timestamp");
    -      if (ts != null) {
    -        indexRequestBuilder = indexRequestBuilder.setTimestamp(ts.toString());
    -      }
    -      bulkRequest.add(indexRequestBuilder);
    -    }
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, Iterable<String> docs)
    --- End diff --
    
    The Elasticsearch tests use this method to load test data.  Rather than using the Transport client here, we use the `UpdateDao` to load the test data.
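
    A minimal sketch of the idea, assuming nothing about Metron's real classes: the test data goes in through the same write interface that production code uses, so the read side is exercised against documents shaped by the real write path. `DocWriter`, `DocReader`, and `InMemoryStore` are hypothetical stand-ins for illustration, not the `UpdateDao` API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical names throughout; none of these types are Metron's.
final class LoadViaDaoSketch {

  /** Write side, standing in for the role `UpdateDao` plays in the PR. */
  interface DocWriter {
    void write(String index, String sensorType, Map<String, Object> doc);
  }

  /** Read side, standing in for the code the tests exercise. */
  interface DocReader {
    Optional<Map<String, Object>> getLatest(String guid, String sensorType);
  }

  /** In-memory store keyed by sensor type and guid; replaces a real index here. */
  static final class InMemoryStore implements DocWriter, DocReader {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    @Override
    public void write(String index, String sensorType, Map<String, Object> doc) {
      docs.put(sensorType + ":" + doc.get("guid"), new HashMap<>(doc));
    }

    @Override
    public Optional<Map<String, Object>> getLatest(String guid, String sensorType) {
      return Optional.ofNullable(docs.get(sensorType + ":" + guid));
    }
  }

  /** Loads test documents through the writer rather than a separate client. */
  static void add(DocWriter writer, String index, String sensorType,
                  Iterable<Map<String, Object>> testDocs) {
    for (Map<String, Object> doc : testDocs) {
      writer.write(index, sensorType, doc);
    }
  }

  /** Writes two documents and reports whether the read side can find them. */
  static boolean roundTrip() {
    InMemoryStore store = new InMemoryStore();
    List<Map<String, Object>> testDocs = new ArrayList<>();
    for (String guid : new String[] {"bro_1", "bro_2"}) {
      Map<String, Object> doc = new HashMap<>();
      doc.put("guid", guid);
      doc.put("source.type", "bro");
      testDocs.add(doc);
    }
    add(store, "bro_index_2017.01.01.01", "bro", testDocs);
    return store.getLatest("bro_1", "bro").isPresent()
        && store.getLatest("bro_2", "bro").isPresent()
        && !store.getLatest("bro_3", "bro").isPresent();
  }
}
```

    Passing the writer in as a parameter keeps the loader independent of how the writer is constructed, which is the same shape as the new `add(UpdateDao, ...)` signature in the diff.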


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r237204881
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java ---
    @@ -194,35 +215,41 @@ public Client getClient() {
         return client;
       }
     
    -  public BulkResponse add(String indexName, String sensorType, String... docs) throws IOException {
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, String... docs)
    +          throws IOException, ParseException {
         List<String> d = new ArrayList<>();
         Collections.addAll(d, docs);
    -    return add(indexName, sensorType, d);
    +    add(updateDao, indexName, sensorType, d);
       }
     
    -  public BulkResponse add(String indexName, String sensorType, Iterable<String> docs)
    -      throws IOException {
    -    BulkRequestBuilder bulkRequest = getClient().prepareBulk();
    -    for (String doc : docs) {
    -      IndexRequestBuilder indexRequestBuilder = getClient()
    -          .prepareIndex(indexName, sensorType + "_doc");
    -
    -      indexRequestBuilder = indexRequestBuilder.setSource(doc);
    -      Map<String, Object> esDoc = JSONUtils.INSTANCE
    -          .load(doc, JSONUtils.MAP_SUPPLIER);
    -      indexRequestBuilder.setId((String) esDoc.get(Constants.GUID));
    -      Object ts = esDoc.get("timestamp");
    -      if (ts != null) {
    -        indexRequestBuilder = indexRequestBuilder.setTimestamp(ts.toString());
    -      }
    -      bulkRequest.add(indexRequestBuilder);
    -    }
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, Iterable<String> docs)
    --- End diff --
    
    The `IndexDao` already handles all of the initialization plumbing that this test class duplicates.


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236461564
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    +
    +    // write the test documents for Bro
    +    List<String> broDocuments = new ArrayList<>();
    +    for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
    +      broDocuments.add(((JSONObject) broObject).toJSONString());
         }
    -    JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
    -    for (Object o : snortArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.add(updateDao, broIndex, "bro", broDocuments);
    +
    +    // write the test documents for Snort
    +    List<String> snortDocuments = new ArrayList<>();
    +    for (Object snortObject: (JSONArray) new JSONParser().parse(snortData)) {
    +      snortDocuments.add(((JSONObject) snortObject).toJSONString());
         }
    -    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    -    if (bulkResponse.hasFailures()) {
    -      throw new RuntimeException("Failed to index test data");
    +    es.add(updateDao, snortIndex, "snort", snortDocuments);
    +
    +    // wait until the test documents are visible
    +    assertEventually(() -> Assert.assertEquals(10, findAll(searchDao).getTotal()));
    --- End diff --
    
    I tried exactly this on a private branch that is closer to the code base in #1269.  I had the integration tests supply a `RefreshPolicy` that was supposed to block until the loaded data was searchable.  I was never able to get that to work consistently, though, despite what the ES docs say; I still needed `assertEventually` for the integration tests to pass consistently.
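
    For readers unfamiliar with the pattern, here is a rough sketch of a polling assertion of the kind discussed here. It is not Metron's `assertEventually` implementation; the `Assertion` interface, the timeout and sleep parameters, and the `pollsUntilVisible` demo are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a polling assertion helper; names are illustrative.
final class EventuallySketch {

  /** An assertion that may fail until the awaited state is reached. */
  interface Assertion {
    void apply() throws Exception;
  }

  /**
   * Re-runs the assertion until it passes or the timeout elapses.
   * The most recent AssertionError is rethrown once the deadline has passed.
   */
  static void assertEventually(Assertion assertion, long timeoutMillis, long sleepMillis)
      throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (true) {
      try {
        assertion.apply();
        return;
      } catch (AssertionError e) {
        if (System.currentTimeMillis() >= deadline) {
          throw e;
        }
        Thread.sleep(sleepMillis);
      }
    }
  }

  /** Simulates an index whose documents only become visible after a few polls. */
  static int pollsUntilVisible(int visibleAfter) throws Exception {
    AtomicInteger polls = new AtomicInteger();
    assertEventually(() -> {
      if (polls.incrementAndGet() < visibleAfter) {
        throw new AssertionError("documents not yet visible");
      }
    }, 5_000L, 1L);
    return polls.get();
  }
}
```

    The trade-off is the one raised in this thread: polling tolerates a write path whose visibility guarantees are unreliable in practice, at the cost of slower failures when the data never shows up.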



---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236521617
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/components/ElasticSearchComponent.java ---
    @@ -194,35 +215,41 @@ public Client getClient() {
         return client;
       }
     
    -  public BulkResponse add(String indexName, String sensorType, String... docs) throws IOException {
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, String... docs)
    +          throws IOException, ParseException {
         List<String> d = new ArrayList<>();
         Collections.addAll(d, docs);
    -    return add(indexName, sensorType, d);
    +    add(updateDao, indexName, sensorType, d);
       }
     
    -  public BulkResponse add(String indexName, String sensorType, Iterable<String> docs)
    -      throws IOException {
    -    BulkRequestBuilder bulkRequest = getClient().prepareBulk();
    -    for (String doc : docs) {
    -      IndexRequestBuilder indexRequestBuilder = getClient()
    -          .prepareIndex(indexName, sensorType + "_doc");
    -
    -      indexRequestBuilder = indexRequestBuilder.setSource(doc);
    -      Map<String, Object> esDoc = JSONUtils.INSTANCE
    -          .load(doc, JSONUtils.MAP_SUPPLIER);
    -      indexRequestBuilder.setId((String) esDoc.get(Constants.GUID));
    -      Object ts = esDoc.get("timestamp");
    -      if (ts != null) {
    -        indexRequestBuilder = indexRequestBuilder.setTimestamp(ts.toString());
    -      }
    -      bulkRequest.add(indexRequestBuilder);
    -    }
    +  public void add(UpdateDao updateDao, String indexName, String sensorType, Iterable<String> docs)
    --- End diff --
    
    Might it be better to just use `IndexDao`, which `extends UpdateDao, SearchDao, RetrieveLatestDao, ColumnMetadataDao`? To that end, if we're looking to route all of this through the ES component in that fashion, it might make sense to replace the internal `private Client client;` with the desired `IndexDao` and use that for the proxied calls to ES. It could be set up when the component is constructed, which removes the need to repeat the same setup in every test that uses the component class, unless a test actually wants to do something custom and pass in its own DAO during init.
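
    A loose sketch of the shape being suggested, under the assumption that a composite DAO can be built with sensible defaults. `Writer`, `Searcher`, `CompositeDao`, `InMemoryDao`, and `SearchComponent` are hypothetical names, not Metron's `IndexDao` or `ElasticSearchComponent`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types only; this is not the ElasticSearchComponent API.
final class ComponentSketch {

  interface Writer {
    void write(String index, String doc);
  }

  interface Searcher {
    int count(String index);
  }

  /** One interface that combines the narrower ones, as a composite DAO would. */
  interface CompositeDao extends Writer, Searcher { }

  /** In-memory default used in place of a real search index. */
  static final class InMemoryDao implements CompositeDao {
    private final Map<String, List<String>> indices = new HashMap<>();

    @Override
    public void write(String index, String doc) {
      indices.computeIfAbsent(index, k -> new ArrayList<>()).add(doc);
    }

    @Override
    public int count(String index) {
      List<String> docs = indices.get(index);
      return docs == null ? 0 : docs.size();
    }
  }

  /** Test component that owns its DAO; callers no longer pass one to add(). */
  static final class SearchComponent {
    private final CompositeDao dao;

    /** Default wiring, created once at construction time. */
    SearchComponent() {
      this(new InMemoryDao());
    }

    /** Override for tests that want custom behaviour. */
    SearchComponent(CompositeDao dao) {
      this.dao = dao;
    }

    void add(String index, String... docs) {
      for (String doc : docs) {
        dao.write(index, doc);
      }
    }

    int count(String index) {
      return dao.count(index);
    }
  }

  /** Uses the default DAO created by the component itself. */
  static int countWithDefaultDao() {
    SearchComponent component = new SearchComponent();
    component.add("snort_index", "{\"guid\":\"a\"}", "{\"guid\":\"b\"}");
    return component.count("snort_index");
  }

  /** Supplies a pre-populated DAO, as a test with custom needs might. */
  static int countWithCustomDao() {
    InMemoryDao custom = new InMemoryDao();
    custom.write("snort_index", "{\"guid\":\"preloaded\"}");
    SearchComponent component = new SearchComponent(custom);
    component.add("snort_index", "{\"guid\":\"a\"}");
    return component.count("snort_index");
  }
}
```

    With this shape, most tests would call `new SearchComponent()` and never touch the DAO wiring, while the second constructor keeps the escape hatch for custom setups.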


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236520379
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,48 +118,81 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    --- End diff --
    
    Oh wow, do we not have a master factory for stitching these all together in the desired default manner? Not sure if this helps, but how about this: https://github.com/apache/metron/blob/master/metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/dao/ElasticsearchDao.java#L97


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236521989
  
    --- Diff: metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java ---
    @@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
         exception.expectMessage("Document contains at least one immense term in field=\"error_hash\"");
         getDao().update(errorDoc, Optional.of("error"));
       }
    +
    +  @Test
    +  @Override
    +  public void test() throws Exception {
    --- End diff --
    
    What do these changes have to do with the change to the ES DAO read/write approach? This class tests Solr; should this be in a separate PR?


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236440028
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    +
    +    // write the test documents for Bro
    +    List<String> broDocuments = new ArrayList<>();
    +    for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
    +      broDocuments.add(((JSONObject) broObject).toJSONString());
         }
    -    JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
    -    for (Object o : snortArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.add(updateDao, broIndex, "bro", broDocuments);
    +
    +    // write the test documents for Snort
    +    List<String> snortDocuments = new ArrayList<>();
    +    for (Object snortObject: (JSONArray) new JSONParser().parse(snortData)) {
    +      snortDocuments.add(((JSONObject) snortObject).toJSONString());
         }
    -    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    -    if (bulkResponse.hasFailures()) {
    -      throw new RuntimeException("Failed to index test data");
    +    es.add(updateDao, snortIndex, "snort", snortDocuments);
    +
    +    // wait until the test documents are visible
    +    assertEventually(() -> Assert.assertEquals(10, findAll(searchDao).getTotal()));
    --- End diff --
    
    @nickwallen is it possible to parameterize the call to do something similar to this? https://github.com/apache/metron/pull/1242/files#diff-99b67fd77500d6232462dc7c27ecff67R148
    
    It's possible this is only available on batch calls, but I wanted to mention it in case it lets us get away without the `assertEventually` calls.


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236519370
  
    --- Diff: metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java ---
    @@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
         exception.expectMessage("Document contains at least one immense term in field=\"error_hash\"");
         getDao().update(errorDoc, Optional.of("error"));
       }
    +
    +  @Test
    +  @Override
    +  public void test() throws Exception {
    +    List<Map<String, Object>> inputData = new ArrayList<>();
    +    for(int i = 0; i < 10;++i) {
    +      final String name = "message" + i;
    +      inputData.add(
    +              new HashMap<String, Object>() {{
    +                put("source.type", SENSOR_NAME);
    +                put("name" , name);
    +                put("timestamp", System.currentTimeMillis());
    +                put(Constants.GUID, name);
    +              }}
    +      );
    +    }
    +    addTestData(getIndexName(), SENSOR_NAME, inputData);
    +    List<Map<String,Object>> docs = null;
    +    for(int t = 0;t < MAX_RETRIES;++t, Thread.sleep(SLEEP_MS)) {
    +      docs = getIndexedTestData(getIndexName(), SENSOR_NAME);
    +      if(docs.size() >= 10) {
    +        break;
    +      }
    +    }
    +    Assert.assertEquals(10, docs.size());
    +    //modify the first message and add a new field
    +    {
    +      Map<String, Object> message0 = new HashMap<String, Object>(inputData.get(0)) {{
    +        put("new-field", "metron");
    +      }};
    +      String guid = "" + message0.get(Constants.GUID);
    +      Document update = getDao().replace(new ReplaceRequest(){{
    +        setReplacement(message0);
    +        setGuid(guid);
    +        setSensorType(SENSOR_NAME);
    +        setIndex(getIndexName());
    +      }}, Optional.empty());
    +
    +      Assert.assertEquals(message0, update.getDocument());
    +      Assert.assertEquals(1, getMockHTable().size());
    +      findUpdatedDoc(message0, guid, SENSOR_NAME);
    +      {
    +        //ensure hbase is up to date
    +        Get g = new Get(HBaseDao.Key.toBytes(new HBaseDao.Key(guid, SENSOR_NAME)));
    +        Result r = getMockHTable().get(g);
    +        NavigableMap<byte[], byte[]> columns = r.getFamilyMap(CF.getBytes());
    +        Assert.assertEquals(1, columns.size());
    +        Assert.assertEquals(message0
    +                , JSONUtils.INSTANCE.load(new String(columns.lastEntry().getValue())
    +                        , JSONUtils.MAP_SUPPLIER)
    +        );
    +      }
    +      {
    +        //ensure ES is up-to-date
    --- End diff --
    
    Isn't this a Solr test?


---

[GitHub] metron issue #1247: METRON-1845 Correct Test Data Load in Elasticsearch Inte...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on the issue:

    https://github.com/apache/metron/pull/1247
  
    lgtm @nickwallen, +1 pending Travis


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r228301494
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    +
    +    // write the test documents for Bro
    +    List<String> broDocuments = new ArrayList<>();
    +    for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
    +      broDocuments.add(((JSONObject) broObject).toJSONString());
         }
    -    JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
    -    for (Object o : snortArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.add(updateDao, broIndex, "bro", broDocuments);
    +
    +    // write the test documents for Snort
    +    List<String> snortDocuments = new ArrayList<>();
    +    for (Object snortObject: (JSONArray) new JSONParser().parse(snortData)) {
    +      snortDocuments.add(((JSONObject) snortObject).toJSONString());
         }
    -    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    -    if (bulkResponse.hasFailures()) {
    -      throw new RuntimeException("Failed to index test data");
    +    es.add(updateDao, snortIndex, "snort", snortDocuments);
    +
    +    // wait until the test documents are visible
    +    assertEventually(() -> Assert.assertEquals(10, findAll(searchDao).getTotal()));
    --- End diff --
    
    Unless I added this to hold up the test and wait for the test data to become 'visible', some of the tests would reliably fail because the test data had not yet been loaded.
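
    For readers unfamiliar with this kind of wait, here is a minimal sketch of the polling idea. It is not Metron's actual `assertEventually`; the class name `EventuallySketch`, the `Assertion` interface, and the timeout and sleep parameters are all assumptions made for illustration. The counter in `main` is a stand-in for a search index whose documents only become visible after a refresh.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and parameters are illustrative, not Metron's API.
class EventuallySketch {

  /** An assertion that throws AssertionError while the data is not yet visible. */
  @FunctionalInterface
  public interface Assertion {
    void apply();
  }

  /**
   * Retries the assertion until it passes or the timeout elapses.
   * If the timeout is reached, the most recent AssertionError is rethrown.
   */
  public static void assertEventually(Assertion assertion, long timeoutMillis, long sleepMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (true) {
      try {
        assertion.apply();
        return; // the assertion passed
      } catch (AssertionError e) {
        if (System.currentTimeMillis() >= deadline) {
          throw e; // give up and surface the latest failure
        }
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting", ie);
        }
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for an index: five more documents become visible on each poll.
    AtomicInteger visibleDocs = new AtomicInteger(0);
    assertEventually(() -> {
      if (visibleDocs.addAndGet(5) < 10) {
        throw new AssertionError("expected 10 documents, found " + visibleDocs.get());
      }
    }, 5_000, 10);
    System.out.println("documents visible: " + visibleDocs.get());
  }
}
```

    The trade-off compared with the old `WAIT_UNTIL` refresh policy is that the test polls through the read path instead of blocking on the write, so a persistent failure shows up as a timeout with the last assertion message.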


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236524985
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,45 +118,63 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    +
    +    // write the test documents for Bro
    +    List<String> broDocuments = new ArrayList<>();
    +    for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
    +      broDocuments.add(((JSONObject) broObject).toJSONString());
         }
    -    JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
    -    for (Object o : snortArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.add(updateDao, broIndex, "bro", broDocuments);
    +
    +    // write the test documents for Snort
    +    List<String> snortDocuments = new ArrayList<>();
    +    for (Object snortObject: (JSONArray) new JSONParser().parse(snortData)) {
    +      snortDocuments.add(((JSONObject) snortObject).toJSONString());
         }
    -    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    -    if (bulkResponse.hasFailures()) {
    -      throw new RuntimeException("Failed to index test data");
    +    es.add(updateDao, snortIndex, "snort", snortDocuments);
    +
    +    // wait until the test documents are visible
    +    assertEventually(() -> Assert.assertEquals(10, findAll(searchDao).getTotal()));
    --- End diff --
    
    Are you able to get this to fail consistently? I ran it a number of times and couldn't get it to fail. I'm wondering if there are any other interesting entries in the logs when this occurs. These tests from ES might be of help - https://github.com/elastic/elasticsearch/pull/17986/files#diff-1c9f982dbd2f9ddb2853d135884621b5R112


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r236521838
  
    --- Diff: metron-platform/metron-solr/src/test/java/org/apache/metron/solr/integration/SolrUpdateIntegrationTest.java ---
    @@ -186,4 +195,114 @@ public void testHugeErrorFields() throws Exception {
         exception.expectMessage("Document contains at least one immense term in field=\"error_hash\"");
         getDao().update(errorDoc, Optional.of("error"));
       }
    +
    +  @Test
    +  @Override
    +  public void test() throws Exception {
    +    List<Map<String, Object>> inputData = new ArrayList<>();
    +    for(int i = 0; i < 10;++i) {
    +      final String name = "message" + i;
    +      inputData.add(
    +              new HashMap<String, Object>() {{
    +                put("source.type", SENSOR_NAME);
    +                put("name" , name);
    +                put("timestamp", System.currentTimeMillis());
    +                put(Constants.GUID, name);
    +              }}
    +      );
    +    }
    +    addTestData(getIndexName(), SENSOR_NAME, inputData);
    +    List<Map<String,Object>> docs = null;
    +    for(int t = 0;t < MAX_RETRIES;++t, Thread.sleep(SLEEP_MS)) {
    +      docs = getIndexedTestData(getIndexName(), SENSOR_NAME);
    +      if(docs.size() >= 10) {
    +        break;
    +      }
    +    }
    +    Assert.assertEquals(10, docs.size());
    +    //modify the first message and add a new field
    +    {
    +      Map<String, Object> message0 = new HashMap<String, Object>(inputData.get(0)) {{
    +        put("new-field", "metron");
    +      }};
    +      String guid = "" + message0.get(Constants.GUID);
    +      Document update = getDao().replace(new ReplaceRequest(){{
    +        setReplacement(message0);
    +        setGuid(guid);
    +        setSensorType(SENSOR_NAME);
    +        setIndex(getIndexName());
    +      }}, Optional.empty());
    +
    +      Assert.assertEquals(message0, update.getDocument());
    +      Assert.assertEquals(1, getMockHTable().size());
    +      findUpdatedDoc(message0, guid, SENSOR_NAME);
    +      {
    +        //ensure hbase is up to date
    +        Get g = new Get(HBaseDao.Key.toBytes(new HBaseDao.Key(guid, SENSOR_NAME)));
    +        Result r = getMockHTable().get(g);
    +        NavigableMap<byte[], byte[]> columns = r.getFamilyMap(CF.getBytes());
    +        Assert.assertEquals(1, columns.size());
    +        Assert.assertEquals(message0
    +                , JSONUtils.INSTANCE.load(new String(columns.lastEntry().getValue())
    +                        , JSONUtils.MAP_SUPPLIER)
    +        );
    +      }
    +      {
    +        //ensure ES is up-to-date
    +        long cnt = 0;
    +        for (int t = 0; t < MAX_RETRIES && cnt == 0; ++t, Thread.sleep(SLEEP_MS)) {
    +          docs = getIndexedTestData(getIndexName(), SENSOR_NAME);
    +          cnt = docs
    +                  .stream()
    +                  .filter(d -> message0.get("new-field").equals(d.get("new-field")))
    +                  .count();
    +        }
    +        Assert.assertNotEquals("Data store is not updated!", cnt, 0);
    +      }
    +    }
    +    //modify the same message and modify the new field
    +    {
    +      Map<String, Object> message0 = new HashMap<String, Object>(inputData.get(0)) {{
    +        put("new-field", "metron2");
    +      }};
    +      String guid = "" + message0.get(Constants.GUID);
    +      Document update = getDao().replace(new ReplaceRequest(){{
    +        setReplacement(message0);
    +        setGuid(guid);
    +        setSensorType(SENSOR_NAME);
    +        setIndex(getIndexName());
    +      }}, Optional.empty());
    +      Assert.assertEquals(message0, update.getDocument());
    +      Assert.assertEquals(1, getMockHTable().size());
    +      Document doc = getDao().getLatest(guid, SENSOR_NAME);
    +      Assert.assertEquals(message0, doc.getDocument());
    +      findUpdatedDoc(message0, guid, SENSOR_NAME);
    +      {
    +        //ensure hbase is up to date
    +        Get g = new Get(HBaseDao.Key.toBytes(new HBaseDao.Key(guid, SENSOR_NAME)));
    +        Result r = getMockHTable().get(g);
    +        NavigableMap<byte[], byte[]> columns = r.getFamilyMap(CF.getBytes());
    +        Assert.assertEquals(2, columns.size());
    +        Assert.assertEquals(message0, JSONUtils.INSTANCE.load(new String(columns.lastEntry().getValue())
    +                , JSONUtils.MAP_SUPPLIER)
    +        );
    +        Assert.assertNotEquals(message0, JSONUtils.INSTANCE.load(new String(columns.firstEntry().getValue())
    +                , JSONUtils.MAP_SUPPLIER)
    +        );
    +      }
    +      {
    +        //ensure ES is up-to-date
    --- End diff --
    
    This comment should say Solr, not ES.


---

[GitHub] metron pull request #1247: METRON-1845 Correct Test Data Load in Elasticsear...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1247#discussion_r228299529
  
    --- Diff: metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/integration/ElasticsearchSearchIntegrationTest.java ---
    @@ -97,48 +118,81 @@ protected static InMemoryComponent startIndex() throws Exception {
         return es;
       }
     
    -  protected static void loadTestData() throws ParseException, IOException {
    +  protected static void loadTestData() throws Exception {
         ElasticSearchComponent es = (ElasticSearchComponent) indexComponent;
     
    +    // define the bro index template
    +    String broIndex = "bro_index_2017.01.01.01";
         JSONObject broTemplate = JSONUtils.INSTANCE.load(new File(broTemplatePath), JSONObject.class);
         addTestFieldMappings(broTemplate, "bro_doc");
    -    es.getClient().admin().indices().prepareCreate("bro_index_2017.01.01.01")
    -        .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +    es.getClient().admin().indices().prepareCreate(broIndex)
    +            .addMapping("bro_doc", JSONUtils.INSTANCE.toJSON(broTemplate.get("mappings"), false)).get();
    +
    +    // define the snort index template
    +    String snortIndex = "snort_index_2017.01.01.02";
         JSONObject snortTemplate = JSONUtils.INSTANCE.load(new File(snortTemplatePath), JSONObject.class);
         addTestFieldMappings(snortTemplate, "snort_doc");
    -    es.getClient().admin().indices().prepareCreate("snort_index_2017.01.01.02")
    -        .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    -
    -    BulkRequestBuilder bulkRequest = es.getClient().prepareBulk()
    -        .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    -    JSONArray broArray = (JSONArray) new JSONParser().parse(broData);
    -    for (Object o : broArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("bro_index_2017.01.01.01", "bro_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.getClient().admin().indices().prepareCreate(snortIndex)
    +            .addMapping("snort_doc", JSONUtils.INSTANCE.toJSON(snortTemplate.get("mappings"), false)).get();
    +
    +    // setup the classes required to write the test data
    +    AccessConfig accessConfig = createAccessConfig();
    +    ElasticsearchClient client = ElasticsearchUtils.getClient(createGlobalConfig());
    +    ElasticsearchRetrieveLatestDao retrieveLatestDao = new ElasticsearchRetrieveLatestDao(client);
    +    ElasticsearchColumnMetadataDao columnMetadataDao = new ElasticsearchColumnMetadataDao(client);
    +    ElasticsearchRequestSubmitter requestSubmitter = new ElasticsearchRequestSubmitter(client);
    +    ElasticsearchUpdateDao updateDao = new ElasticsearchUpdateDao(client, accessConfig, retrieveLatestDao);
    +    ElasticsearchSearchDao searchDao = new ElasticsearchSearchDao(client, accessConfig, columnMetadataDao, requestSubmitter);
    +
    +    // write the test documents for Bro
    +    List<String> broDocuments = new ArrayList<>();
    +    for (Object broObject: (JSONArray) new JSONParser().parse(broData)) {
    +      broDocuments.add(((JSONObject) broObject).toJSONString());
         }
    -    JSONArray snortArray = (JSONArray) new JSONParser().parse(snortData);
    -    for (Object o : snortArray) {
    -      JSONObject jsonObject = (JSONObject) o;
    -      IndexRequestBuilder indexRequestBuilder = es.getClient()
    -          .prepareIndex("snort_index_2017.01.01.02", "snort_doc");
    -      indexRequestBuilder = indexRequestBuilder.setId((String) jsonObject.get("guid"));
    -      indexRequestBuilder = indexRequestBuilder.setSource(jsonObject.toJSONString());
    -      indexRequestBuilder = indexRequestBuilder
    -          .setTimestamp(jsonObject.get("timestamp").toString());
    -      bulkRequest.add(indexRequestBuilder);
    +    es.add(updateDao, broIndex, "bro", broDocuments);
    +
    +    // write the test documents for Snort
    +    List<String> snortDocuments = new ArrayList<>();
    +    for (Object snortObject: (JSONArray) new JSONParser().parse(snortData)) {
    +      snortDocuments.add(((JSONObject) snortObject).toJSONString());
         }
    -    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    -    if (bulkResponse.hasFailures()) {
    -      throw new RuntimeException("Failed to index test data");
    +    es.add(updateDao, snortIndex, "snort", snortDocuments);
    --- End diff --
    
    We are now using `ElasticSearchComponent.add` to load the test data here.  This change matches what already exists in the other ES integration tests.


---