You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2020/09/09 14:29:38 UTC

[GitHub] [systemds] sebwrede opened a new pull request #1051: Fine-Grained Privacy Constraints 2 Rebased

sebwrede opened a new pull request #1051:
URL: https://github.com/apache/systemds/pull/1051


   This PR replaces the [Fine-Grained Privacy Constraints 2 PR](https://github.com/apache/systemds/pull/985). 
   This PR adds fine-grained privacy constraints and adapts the privacy constraint handling to the new federated architecture. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486396082



##########
File path: src/main/java/org/apache/sysds/runtime/privacy/FineGrained/FineGrainedPrivacyList.java
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.sysds.runtime.privacy.FineGrained;
+
+import java.io.Serializable;
+import java.util.AbstractMap;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.sysds.runtime.privacy.PrivacyConstraint.PrivacyLevel;
+
+/**
+ * Simple implementation of retrieving fine-grained privacy constraints
+ * based on pairs in an ArrayList.
+ */
+public class FineGrainedPrivacyList implements FineGrainedPrivacy {
+
+	private ArrayList<Map.Entry<DataRange, PrivacyLevel>> constraintCollection = new ArrayList<>();
+
+	@Override
+	public void put(DataRange dataRange, PrivacyLevel privacyLevel) {
+		constraintCollection.add(new AbstractMap.SimpleEntry<DataRange, PrivacyLevel>(dataRange, privacyLevel));
+	}
+
+	@Override
+	public Map<DataRange,PrivacyLevel> getPrivacyLevel(DataRange searchRange) {
+		Map<DataRange, PrivacyLevel> matches = new LinkedHashMap<>();
+		for ( Map.Entry<DataRange, PrivacyLevel> constraint : constraintCollection ){
+			if ( constraint.getKey().overlaps(searchRange) ) 
+				matches.put(constraint.getKey(), constraint.getValue());
+		}
+		return matches;
+	}
+
+	@Override
+	public Map<DataRange,PrivacyLevel> getPrivacyLevelOfElement(long[] searchIndex) {
+		Map<DataRange, PrivacyLevel> matches = new LinkedHashMap<>();
+		constraintCollection.forEach( constraint -> { 
+			if (constraint.getKey().contains(searchIndex)) 
+				matches.put(constraint.getKey(), constraint.getValue()); 
+		} );
+		return matches;
+	}
+
+	@Override
+	public DataRange[] getDataRangesOfPrivacyLevel(PrivacyLevel privacyLevel) {
+		ArrayList<DataRange> matches = new ArrayList<>();
+		constraintCollection.forEach(constraint -> { if (constraint.getValue() == privacyLevel) matches.add(constraint.getKey()); } );
+		return matches.toArray(new DataRange[0]);
+	}
+
+	@Override
+	public void removeAllConstraints() {
+		constraintCollection.clear();
+	}
+
+	@Override
+	public boolean hasConstraints() {
+		return !constraintCollection.isEmpty();
+	}
+
+	@Override
+	public Map<String, long[][][]> getAllConstraints() {

Review comment:
       This method is currently not in use. At some point, I thought I would use it to serialize the fine-grained privacy constraints. The reason I did not delete it yet is if I for some reason need it later. Should I delete it or keep it as it is? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] Baunsgaard commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486337594



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/Instruction.java
##########
@@ -38,7 +41,21 @@
 		FEDERATED
 	}
 	
-	private static final Log LOG = LogFactory.getLog(Instruction.class.getName());
+	protected static final Log LOG = LogFactory.getLog(Instruction.class.getName());

Review comment:
       I made this private to ensure that the LOG is located in the same file it is called from.
   If you want to make use of it somewhere else make a new Log object in that file.

##########
File path: src/main/java/org/apache/sysds/runtime/instructions/spark/SPInstruction.java
##########
@@ -41,16 +41,15 @@
 	}
 
 	protected final SPType _sptype;
-	protected final Operator _optr;
 	protected final boolean _requiresLabelUpdate;
 
 	protected SPInstruction(SPType type, String opcode, String istr) {
 		this(type, null, opcode, istr);
 	}
 
 	protected SPInstruction(SPType type, Operator op, String opcode, String istr) {
+		super(op);

Review comment:
       can you explain to me this super class call?

##########
File path: src/test/java/org/apache/sysds/test/functions/privacy/GLMTest.java
##########
@@ -154,7 +155,10 @@ public GLMTest (int numRecords_, int numFeatures_, int distFamilyType_, double d
 			{  100,  10,  2, -1.0,  4,  0.0,  0.01, 3.0,  -2.0,  1.0,  1.0, GLMType.Bernoullicloglog1 }, // Bernoulli {-1, 1}.cloglog
 			{  200,  10,  2, -1.0,  5,  0.0,  0.01, 3.0,   0.0,  2.0,  1.0, GLMType.Bernoullicauchit },  // Bernoulli {-1, 1}.cauchit
 		};
-		return Arrays.asList(data);
+		if ( runAll )
+			return Arrays.asList(data);
+		else 
+			return Arrays.asList( new Object[][]{data[0]} );

Review comment:
       again the filter?

##########
File path: src/main/java/org/apache/sysds/runtime/instructions/Instruction.java
##########
@@ -38,7 +41,21 @@
 		FEDERATED
 	}
 	
-	private static final Log LOG = LogFactory.getLog(Instruction.class.getName());
+	protected static final Log LOG = LogFactory.getLog(Instruction.class.getName());
+	protected final Operator _optr;
+
+	protected Instruction(Operator _optr){
+		this._optr = _optr;
+	}
+
+	// local flag for debug output
+	private static final boolean LTRACE = false;
+	static {
+		// for internal debugging only
+		if( LTRACE ) {
+			Logger.getLogger("org.apache.sysds.runtime.instructions.Instruction").setLevel(Level.TRACE);

Review comment:
       we removed all the instances of this overwriting of logging.
   Please change to use the file in:
   /src/test/resources/log4j.properties
   
   if you want to debug something specific.
   

##########
File path: src/main/java/org/apache/sysds/runtime/controlprogram/federated/FederatedWorkerHandler.java
##########
@@ -280,8 +281,9 @@ private FederatedResponse execInstruction(FederatedRequest request) {
 		ExecutionContext ec = _ecm.get(request.getTID());
 		BasicProgramBlock pb = new BasicProgramBlock(null);
 		pb.getInstructions().clear();
-		pb.getInstructions().add(InstructionParser
-			.parseSingleInstruction((String)request.getParam(0)));
+		Instruction receivedInstruction = InstructionParser
+				.parseSingleInstruction((String)request.getParam(0));
+		pb.getInstructions().add(receivedInstruction);

Review comment:
       I don't see what this change does.

##########
File path: src/main/java/org/apache/sysds/runtime/privacy/FineGrained/FineGrainedPrivacyList.java
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.sysds.runtime.privacy.FineGrained;
+
+import java.io.Serializable;
+import java.util.AbstractMap;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.sysds.runtime.privacy.PrivacyConstraint.PrivacyLevel;
+
+/**
+ * Simple implementation of retrieving fine-grained privacy constraints
+ * based on pairs in an ArrayList.
+ */
+public class FineGrainedPrivacyList implements FineGrainedPrivacy {
+
+	private ArrayList<Map.Entry<DataRange, PrivacyLevel>> constraintCollection = new ArrayList<>();
+
+	@Override
+	public void put(DataRange dataRange, PrivacyLevel privacyLevel) {
+		constraintCollection.add(new AbstractMap.SimpleEntry<DataRange, PrivacyLevel>(dataRange, privacyLevel));
+	}
+
+	@Override
+	public Map<DataRange,PrivacyLevel> getPrivacyLevel(DataRange searchRange) {
+		Map<DataRange, PrivacyLevel> matches = new LinkedHashMap<>();

Review comment:
       if possible can it not be a linkedHashMap, since the performance of that is ... slow, in many cases.

##########
File path: src/main/java/org/apache/sysds/parser/DataExpression.java
##########
@@ -2239,9 +2251,44 @@ public boolean isRead()
 	 * Sets privacy of identifier if privacy variable parameter is set.  
 	 */
 	private void setPrivacy(){
-		Expression eprivacy = getVarParam("privacy");
-		if ( eprivacy != null ){
-			getOutput().setPrivacy(PrivacyLevel.valueOf(eprivacy.toString()));
+		Expression eprivacy = getVarParam(PRIVACY);

Review comment:
       can this method be split up into multiple, or add some comments?

##########
File path: src/main/java/org/apache/sysds/runtime/privacy/PrivacyPropagator.java
##########
@@ -414,4 +591,25 @@ else if ( inst instanceof SqlCPInstruction )
 			instructionOutputNames = new String[]{((SqlCPInstruction) inst).getOutputVariableName()};
 		return instructionOutputNames;
 	}
+
+	private static ArrayList<CPOperand> getOutputOperands(Instruction inst){
+		// The order of the following statements is important
+		if ( inst instanceof MultiReturnParameterizedBuiltinCPInstruction )
+			return ((MultiReturnParameterizedBuiltinCPInstruction) inst).getOutputs();
+		else if ( inst instanceof MultiReturnBuiltinCPInstruction )
+			return ((MultiReturnBuiltinCPInstruction) inst).getOutputs();
+		else if ( inst instanceof ComputationCPInstruction )
+			return getSingletonList(((ComputationCPInstruction) inst).getOutput());
+		else if ( inst instanceof VariableCPInstruction )
+			return getSingletonList(((VariableCPInstruction) inst).getOutput());
+		else if ( inst instanceof SqlCPInstruction )
+			return getSingletonList(((SqlCPInstruction) inst).getOutput());
+		return new ArrayList<CPOperand>();

Review comment:
       not 100% sure, but the if else is instance of is probably slow, and there should be some alternative. 

##########
File path: src/main/java/org/apache/sysds/runtime/privacy/FineGrained/FineGrainedPrivacyList.java
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.sysds.runtime.privacy.FineGrained;
+
+import java.io.Serializable;
+import java.util.AbstractMap;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.sysds.runtime.privacy.PrivacyConstraint.PrivacyLevel;
+
+/**
+ * Simple implementation of retrieving fine-grained privacy constraints
+ * based on pairs in an ArrayList.
+ */
+public class FineGrainedPrivacyList implements FineGrainedPrivacy {
+
+	private ArrayList<Map.Entry<DataRange, PrivacyLevel>> constraintCollection = new ArrayList<>();
+
+	@Override
+	public void put(DataRange dataRange, PrivacyLevel privacyLevel) {
+		constraintCollection.add(new AbstractMap.SimpleEntry<DataRange, PrivacyLevel>(dataRange, privacyLevel));
+	}
+
+	@Override
+	public Map<DataRange,PrivacyLevel> getPrivacyLevel(DataRange searchRange) {
+		Map<DataRange, PrivacyLevel> matches = new LinkedHashMap<>();
+		for ( Map.Entry<DataRange, PrivacyLevel> constraint : constraintCollection ){
+			if ( constraint.getKey().overlaps(searchRange) ) 
+				matches.put(constraint.getKey(), constraint.getValue());
+		}
+		return matches;
+	}
+
+	@Override
+	public Map<DataRange,PrivacyLevel> getPrivacyLevelOfElement(long[] searchIndex) {
+		Map<DataRange, PrivacyLevel> matches = new LinkedHashMap<>();
+		constraintCollection.forEach( constraint -> { 
+			if (constraint.getKey().contains(searchIndex)) 
+				matches.put(constraint.getKey(), constraint.getValue()); 
+		} );
+		return matches;
+	}
+
+	@Override
+	public DataRange[] getDataRangesOfPrivacyLevel(PrivacyLevel privacyLevel) {
+		ArrayList<DataRange> matches = new ArrayList<>();
+		constraintCollection.forEach(constraint -> { if (constraint.getValue() == privacyLevel) matches.add(constraint.getKey()); } );
+		return matches.toArray(new DataRange[0]);
+	}
+
+	@Override
+	public void removeAllConstraints() {
+		constraintCollection.clear();
+	}
+
+	@Override
+	public boolean hasConstraints() {
+		return !constraintCollection.isEmpty();
+	}
+
+	@Override
+	public Map<String, long[][][]> getAllConstraints() {

Review comment:
       also this array of array of array, seems extravagant. Preferably it would be nice in some other format.

##########
File path: src/main/java/org/apache/sysds/runtime/privacy/PrivacyPropagator.java
##########
@@ -19,15 +19,19 @@
 
 package org.apache.sysds.runtime.privacy;
 
+import java.util.*;

Review comment:
       no wildcard imports.

##########
File path: src/test/java/org/apache/sysds/test/functions/privacy/BuiltinGLMTest.java
##########
@@ -228,6 +229,9 @@ public void runtestGLM(PrivacyConstraint privacyConstraint, Class<?> expectedExc
 				{  100,   10,  2,  1.0,  2,  0.0,  3.0,   0.0,  2.0,  2.5 },   // Binomial two-column.logit
 				{  200,   10,  2,  1.0,  3,  0.0,  3.0,   0.0,  2.0,  2.5 },   // Binomial two-column.probit
 		};
-		return Arrays.asList(data);
+		if ( runAll )
+			return Arrays.asList(data);
+		else
+			return Arrays.asList(new Object[][]{data[0]});

Review comment:
       is it to slow to run all or?

##########
File path: src/main/java/org/apache/sysds/runtime/instructions/cp/MultiReturnBuiltinCPInstruction.java
##########
@@ -47,6 +47,10 @@ public CPOperand getOutput(int i) {
 		return _outputs.get(i);
 	}
 
+	public ArrayList<CPOperand> getOutputs(){

Review comment:
       Maybe change these to List type return?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486439575



##########
File path: src/main/java/org/apache/sysds/parser/DataExpression.java
##########
@@ -2239,9 +2251,44 @@ public boolean isRead()
 	 * Sets privacy of identifier if privacy variable parameter is set.  
 	 */
 	private void setPrivacy(){
-		Expression eprivacy = getVarParam("privacy");
-		if ( eprivacy != null ){
-			getOutput().setPrivacy(PrivacyLevel.valueOf(eprivacy.toString()));
+		Expression eprivacy = getVarParam(PRIVACY);

Review comment:
       Is it easier to read in the current version, @Baunsgaard ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486379692



##########
File path: src/main/java/org/apache/sysds/runtime/controlprogram/federated/FederatedWorkerHandler.java
##########
@@ -280,8 +281,9 @@ private FederatedResponse execInstruction(FederatedRequest request) {
 		ExecutionContext ec = _ecm.get(request.getTID());
 		BasicProgramBlock pb = new BasicProgramBlock(null);
 		pb.getInstructions().clear();
-		pb.getInstructions().add(InstructionParser
-			.parseSingleInstruction((String)request.getParam(0)));
+		Instruction receivedInstruction = InstructionParser
+				.parseSingleInstruction((String)request.getParam(0));
+		pb.getInstructions().add(receivedInstruction);

Review comment:
       It does not do anything differently. I thought at some point that I would need to handle the receivedInstruction, but then I decided against it. I could change it back like it was before and the code would work the same way, so right now it is just a question of readability: is the new version easier to read for you, @Baunsgaard ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486402441



##########
File path: src/test/java/org/apache/sysds/test/functions/privacy/GLMTest.java
##########
@@ -154,7 +155,10 @@ public GLMTest (int numRecords_, int numFeatures_, int distFamilyType_, double d
 			{  100,  10,  2, -1.0,  4,  0.0,  0.01, 3.0,  -2.0,  1.0,  1.0, GLMType.Bernoullicloglog1 }, // Bernoulli {-1, 1}.cloglog
 			{  200,  10,  2, -1.0,  5,  0.0,  0.01, 3.0,   0.0,  2.0,  1.0, GLMType.Bernoullicauchit },  // Bernoulli {-1, 1}.cauchit
 		};
-		return Arrays.asList(data);
+		if ( runAll )
+			return Arrays.asList(data);
+		else 
+			return Arrays.asList( new Object[][]{data[0]} );

Review comment:
       Indeed :smiley: 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486373173



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/Instruction.java
##########
@@ -38,7 +41,21 @@
 		FEDERATED
 	}
 	
-	private static final Log LOG = LogFactory.getLog(Instruction.class.getName());
+	protected static final Log LOG = LogFactory.getLog(Instruction.class.getName());

Review comment:
       Private is also good. I do not need the "protected" to be there. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486385202



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/spark/SPInstruction.java
##########
@@ -41,16 +41,15 @@
 	}
 
 	protected final SPType _sptype;
-	protected final Operator _optr;
 	protected final boolean _requiresLabelUpdate;
 
 	protected SPInstruction(SPType type, String opcode, String istr) {
 		this(type, null, opcode, istr);
 	}
 
 	protected SPInstruction(SPType type, Operator op, String opcode, String istr) {
+		super(op);

Review comment:
       It calls the super class constructor, which sets the Operator field (which is in the super class). 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486393369



##########
File path: src/main/java/org/apache/sysds/runtime/privacy/FineGrained/FineGrainedPrivacyList.java
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.sysds.runtime.privacy.FineGrained;
+
+import java.io.Serializable;
+import java.util.AbstractMap;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.sysds.runtime.privacy.PrivacyConstraint.PrivacyLevel;
+
+/**
+ * Simple implementation of retrieving fine-grained privacy constraints
+ * based on pairs in an ArrayList.
+ */
+public class FineGrainedPrivacyList implements FineGrainedPrivacy {
+
+	private ArrayList<Map.Entry<DataRange, PrivacyLevel>> constraintCollection = new ArrayList<>();
+
+	@Override
+	public void put(DataRange dataRange, PrivacyLevel privacyLevel) {
+		constraintCollection.add(new AbstractMap.SimpleEntry<DataRange, PrivacyLevel>(dataRange, privacyLevel));
+	}
+
+	@Override
+	public Map<DataRange,PrivacyLevel> getPrivacyLevel(DataRange searchRange) {
+		Map<DataRange, PrivacyLevel> matches = new LinkedHashMap<>();

Review comment:
       This depends on how it is used. In this case, the "containsValue" is the most used call on the returned map. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] Baunsgaard commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486393950



##########
File path: src/main/java/org/apache/sysds/parser/DataExpression.java
##########
@@ -2239,9 +2251,44 @@ public boolean isRead()
 	 * Sets privacy of identifier if privacy variable parameter is set.  
 	 */
 	private void setPrivacy(){
-		Expression eprivacy = getVarParam("privacy");
-		if ( eprivacy != null ){
-			getOutput().setPrivacy(PrivacyLevel.valueOf(eprivacy.toString()));
+		Expression eprivacy = getVarParam(PRIVACY);

Review comment:
       i only ask because when i was going through all the code, this stood out, and was ...  to complicated to review.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486403288



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/cp/MultiReturnBuiltinCPInstruction.java
##########
@@ -47,6 +47,10 @@ public CPOperand getOutput(int i) {
 		return _outputs.get(i);
 	}
 
+	public ArrayList<CPOperand> getOutputs(){

Review comment:
       Yes, I usually also go for that option. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] Baunsgaard commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486393184



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/cp/MultiReturnBuiltinCPInstruction.java
##########
@@ -47,6 +47,10 @@ public CPOperand getOutput(int i) {
 		return _outputs.get(i);
 	}
 
+	public ArrayList<CPOperand> getOutputs(){

Review comment:
       well, its an opinion thing, i like to return the highest abstraction, i think we had a course this was mentioned as a point.
   Software Engineering... :+1: 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] mboehm7 commented on pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
mboehm7 commented on pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#issuecomment-691664020


   Also thanks to @Baunsgaard for the earlier review.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486439247



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/cp/MultiReturnBuiltinCPInstruction.java
##########
@@ -47,6 +47,10 @@ public CPOperand getOutput(int i) {
 		return _outputs.get(i);
 	}
 
+	public ArrayList<CPOperand> getOutputs(){

Review comment:
       I have changed it now. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486383020



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/cp/MultiReturnBuiltinCPInstruction.java
##########
@@ -47,6 +47,10 @@ public CPOperand getOutput(int i) {
 		return _outputs.get(i);
 	}
 
+	public ArrayList<CPOperand> getOutputs(){

Review comment:
       I also thought about doing that. It will always be an ArrayList, so in the end it would not make much of a difference. It is more an API-design thing. Do you prefer the List type? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486393369



##########
File path: src/main/java/org/apache/sysds/runtime/privacy/FineGrained/FineGrainedPrivacyList.java
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.sysds.runtime.privacy.FineGrained;
+
+import java.io.Serializable;
+import java.util.AbstractMap;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.sysds.runtime.privacy.PrivacyConstraint.PrivacyLevel;
+
+/**
+ * Simple implementation of retrieving fine-grained privacy constraints
+ * based on pairs in an ArrayList.
+ */
+public class FineGrainedPrivacyList implements FineGrainedPrivacy {
+
+	private ArrayList<Map.Entry<DataRange, PrivacyLevel>> constraintCollection = new ArrayList<>();
+
+	@Override
+	public void put(DataRange dataRange, PrivacyLevel privacyLevel) {
+		constraintCollection.add(new AbstractMap.SimpleEntry<DataRange, PrivacyLevel>(dataRange, privacyLevel));
+	}
+
+	@Override
+	public Map<DataRange,PrivacyLevel> getPrivacyLevel(DataRange searchRange) {
+		Map<DataRange, PrivacyLevel> matches = new LinkedHashMap<>();

Review comment:
       This depends on how it is used. In this case, the "containsValue" is the most used call on the returned map. 
   Which map type would you suggest?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] mboehm7 commented on pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
mboehm7 commented on pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#issuecomment-691663906


   LGTM. Thanks for the patch @sebwrede and reconciling the tests with the new federated backend. I only made some minor changes: removed warnings, fixed few formatting issues, renamed new package to `finegrained`, and moved static methods for constraint parsing from `DataExpression` to `PrivacyUtils`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486369426



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/Instruction.java
##########
@@ -38,7 +41,21 @@
 		FEDERATED
 	}
 	
-	private static final Log LOG = LogFactory.getLog(Instruction.class.getName());
+	protected static final Log LOG = LogFactory.getLog(Instruction.class.getName());
+	protected final Operator _optr;
+
+	protected Instruction(Operator _optr){
+		this._optr = _optr;
+	}
+
+	// local flag for debug output
+	private static final boolean LTRACE = false;
+	static {
+		// for internal debugging only
+		if( LTRACE ) {
+			Logger.getLogger("org.apache.sysds.runtime.instructions.Instruction").setLevel(Level.TRACE);

Review comment:
       I am not using it, so I just removed it. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486407377



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/spark/SPInstruction.java
##########
@@ -41,16 +41,15 @@
 	}
 
 	protected final SPType _sptype;
-	protected final Operator _optr;
 	protected final boolean _requiresLabelUpdate;
 
 	protected SPInstruction(SPType type, String opcode, String istr) {
 		this(type, null, opcode, istr);
 	}
 
 	protected SPInstruction(SPType type, Operator op, String opcode, String istr) {
+		super(op);

Review comment:
       Because they all have the same field, but it was written in each of the classes. I placed it into their super class (Instruction) and then decided that calling the super constructor is a good idea, since we would then be able to change the way the operator is set for all subclasses by changing the Instruction constructor. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486396715



##########
File path: src/main/java/org/apache/sysds/runtime/privacy/PrivacyPropagator.java
##########
@@ -19,15 +19,19 @@
 
 package org.apache.sysds.runtime.privacy;
 
+import java.util.*;

Review comment:
       OK




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486399379



##########
File path: src/main/java/org/apache/sysds/runtime/privacy/PrivacyPropagator.java
##########
@@ -414,4 +591,25 @@ else if ( inst instanceof SqlCPInstruction )
 			instructionOutputNames = new String[]{((SqlCPInstruction) inst).getOutputVariableName()};
 		return instructionOutputNames;
 	}
+
+	private static ArrayList<CPOperand> getOutputOperands(Instruction inst){
+		// The order of the following statements is important
+		if ( inst instanceof MultiReturnParameterizedBuiltinCPInstruction )
+			return ((MultiReturnParameterizedBuiltinCPInstruction) inst).getOutputs();
+		else if ( inst instanceof MultiReturnBuiltinCPInstruction )
+			return ((MultiReturnBuiltinCPInstruction) inst).getOutputs();
+		else if ( inst instanceof ComputationCPInstruction )
+			return getSingletonList(((ComputationCPInstruction) inst).getOutput());
+		else if ( inst instanceof VariableCPInstruction )
+			return getSingletonList(((VariableCPInstruction) inst).getOutput());
+		else if ( inst instanceof SqlCPInstruction )
+			return getSingletonList(((SqlCPInstruction) inst).getOutput());
+		return new ArrayList<CPOperand>();

Review comment:
       The alternative is to have a method in Instruction called "getOutputs" and then implement it in the subclasses. This is however not the general design we use at the moment, but it would make much of the code cleaner to look at. For instance, most of the switch statements in this class could also be avoided if we changed the design and used Java inheritance instead. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] asfgit closed pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #1051:
URL: https://github.com/apache/systemds/pull/1051


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486376414



##########
File path: src/main/java/org/apache/sysds/parser/DataExpression.java
##########
@@ -2239,9 +2251,44 @@ public boolean isRead()
 	 * Sets privacy of identifier if privacy variable parameter is set.  
 	 */
 	private void setPrivacy(){
-		Expression eprivacy = getVarParam("privacy");
-		if ( eprivacy != null ){
-			getOutput().setPrivacy(PrivacyLevel.valueOf(eprivacy.toString()));
+		Expression eprivacy = getVarParam(PRIVACY);

Review comment:
       Yes, I also considered splitting it up, but the current implementation actually fits the general style of this class :stuck_out_tongue: .
   I can change it now, if it is too confusing like this. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] Baunsgaard commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486391285



##########
File path: src/main/java/org/apache/sysds/runtime/instructions/spark/SPInstruction.java
##########
@@ -41,16 +41,15 @@
 	}
 
 	protected final SPType _sptype;
-	protected final Operator _optr;
 	protected final boolean _requiresLabelUpdate;
 
 	protected SPInstruction(SPType type, String opcode, String istr) {
 		this(type, null, opcode, istr);
 	}
 
 	protected SPInstruction(SPType type, Operator op, String opcode, String istr) {
+		super(op);

Review comment:
       yes, but why do you need to change it? because this change is done in CP, SP, Fed, and GPU, and i don't see why.?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] sebwrede commented on a change in pull request #1051: [SYSTEMDS-2605] Fine-Grained Privacy Constraints 2 Rebased

Posted by GitBox <gi...@apache.org>.
sebwrede commented on a change in pull request #1051:
URL: https://github.com/apache/systemds/pull/1051#discussion_r486400993



##########
File path: src/test/java/org/apache/sysds/test/functions/privacy/BuiltinGLMTest.java
##########
@@ -228,6 +229,9 @@ public void runtestGLM(PrivacyConstraint privacyConstraint, Class<?> expectedExc
 				{  100,   10,  2,  1.0,  2,  0.0,  3.0,   0.0,  2.0,  2.5 },   // Binomial two-column.logit
 				{  200,   10,  2,  1.0,  3,  0.0,  3.0,   0.0,  2.0,  2.5 },   // Binomial two-column.probit
 		};
-		return Arrays.asList(data);
+		if ( runAll )
+			return Arrays.asList(data);
+		else
+			return Arrays.asList(new Object[][]{data[0]});

Review comment:
       Yes, it is very slow. The data has now also been reduced, so it is way faster now. If we later need the different versions of GLM to test different use cases of the privacy constraints, then we can remove the above code and thereby include all test cases, but for now it is not necessary. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org