CloneDetection

Detect clones in a cross-language setting. Companion to the paper "Cross-language Clone Detection for Mobile Apps"

View the Project on GitHub FLAGlab/CloneDetection

Loop evaluation results

Loops are an interesting case for universal nodes. Programming languages define loop abstractions with different keywords (e.g., for, while). Out of Step treats loop abstractions as semantically equivalent and classifies them as Type 3 clones. To evaluate this, we manually inject Type 1 and Type 3 clones into our tests.

Snippet 1. Dart for loop - version A

 int sum = 0;
 for (int i = 1; i <= 100; i++) 
    sum = sum + i;

Snippet 2. Dart while loop - version B

 int sum = 0;
 int i = 1;
 while (i <= 100) {
    sum = sum + i;
    i += 1;
 }

Snippet 3. Kotlin for loop - version A

 var sum = 0
 for (i in 1..100) 
    sum = sum + i

Snippet 4. Kotlin while loop - version B

 var sum = 0
 var i = 1
 while (i <= 100) {
    sum = sum + i
    i += 1
 }
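
As a quick sanity check that versions A and B compute the same result, Snippets 3 and 4 can be wrapped in functions and compared. This is only a minimal sketch: the function names sumWithFor and sumWithWhile are ours and do not appear in the project.

```kotlin
// Snippet 3 wrapped in a function (hypothetical name): for loop, version A.
fun sumWithFor(): Int {
    var sum = 0
    for (i in 1..100)
        sum = sum + i
    return sum
}

// Snippet 4 wrapped in a function (hypothetical name): while loop, version B.
fun sumWithWhile(): Int {
    var sum = 0
    var i = 1
    while (i <= 100) {
        sum = sum + i
        i += 1
    }
    return sum
}

fun main() {
    // Both versions add the integers 1..100, so they agree.
    println(sumWithFor() == sumWithWhile()) // prints "true"
    println(sumWithFor())                   // prints "5050"
}
```

Both functions perform the same computation with different loop keywords, which is the kind of difference the loop evaluation injects as a Type 3 clone.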

Single language evaluation

Comparing the two versions within a single language, Dart (Snippets 1 and 2) and Kotlin (Snippets 3 and 4), results in two Type 3 clones. These match the for and while statements and the declaration of the variable i. The postfix statement i++ and the assignment i += 1 are not correctly detected as a Type 3 clone because their node representations in the eCST differ: the former is an assignment_operator unary node, the latter a binary node. Nonetheless, we identify the complete for and while blocks as clones of each other, as the algorithm broadcasts clones inside the body of blocks. Additionally, Out of Step identifies the variable declarations and assignments as Type 1 clones: Line 1 of both snippets, and Line 3 of version A with Line 4 of version B.
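
The mismatch between i++ and i += 1 can be illustrated with a toy model of node kinds. The sketch below is not the Out of Step implementation: the NodeKind enum, the Node class, and the sameKind helper are hypothetical names, and the rule that two nodes must share a kind to be paired is a simplifying assumption based on the description above.

```kotlin
// Hypothetical node kinds, loosely following the description in the text.
enum class NodeKind { LOOP, UNARY_ASSIGNMENT, BINARY_ASSIGNMENT }

// A toy eCST node: only the kind and the source text it covers.
data class Node(val kind: NodeKind, val source: String)

// Assumption: two nodes are clone candidates only if their kinds match.
fun sameKind(a: Node, b: Node): Boolean = a.kind == b.kind

fun main() {
    val forLoop = Node(NodeKind.LOOP, "for (int i = 1; i <= 100; i++)")
    val whileLoop = Node(NodeKind.LOOP, "while (i <= 100)")
    val postfix = Node(NodeKind.UNARY_ASSIGNMENT, "i++")
    val compound = Node(NodeKind.BINARY_ASSIGNMENT, "i += 1")

    println(sameKind(forLoop, whileLoop)) // prints "true": both loops share a kind
    println(sameKind(postfix, compound))  // prints "false": unary vs. binary node
}
```

Under this assumption, the two loop statements pair up even though their keywords differ, while the two increment forms do not, because they end up under different node kinds.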

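| Algorithm | Total | Type 1 | Type 2 | Type 3 | FP | FN | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| Dart A | 26 | 26 | 0 | 0 | 0 | 0 | 1 | 1 |
| Dart B | 27 | 27 | 0 | 0 | 0 | 0 | 1 | 1 |
| Dart A v. B | 12 | 3 | 7 | 2 | 2 | 0 | 0.84 | 1 |
| Kotlin A | 17 | 17 | 0 | 0 | 0 | 0 | 1 | 1 |
| Kotlin B | 24 | 24 | 0 | 0 | 0 | 0 | 1 | 1 |
| Kotlin A v. B | 8 | 3 | 5 | 0 | 0 | 2 | 1 | 0.8 |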

Note that the Dart analysis is effective in detecting the Type 3 clones between the increment of the i variable in the for loop and the while loop, and between the looping structures themselves, whereas the Kotlin analysis fails to detect these Type 3 clones and therefore has a reduced Recall. With respect to Precision, the roles reverse: the Dart analysis reports two FPs among the Type 2 clones between LITERAL and TYPE nodes, while the Kotlin analysis does not report them and reaches full Precision.
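
The Precision and Recall columns can be approximated from the counts in the table. The sketch below assumes that Total counts every reported clone, so that the true positives are Total minus FP; this reading is our assumption and is not stated in the source, and the function names are ours.

```kotlin
import java.util.Locale

// Assumed reading of the table: true positives = Total - FP.
fun precision(total: Int, fp: Int): Double =
    (total - fp).toDouble() / total

// Recall under the same assumption: true positives over true positives plus FN.
fun recall(total: Int, fp: Int, fn: Int): Double {
    val tp = total - fp
    return tp.toDouble() / (tp + fn)
}

// Format with two decimals, using a fixed locale so the separator is a dot.
fun fmt(x: Double): String = String.format(Locale.US, "%.2f", x)

fun main() {
    // Dart A v. B: Total = 12, FP = 2, FN = 0
    println(fmt(precision(12, 2)))  // prints "0.83"
    println(fmt(recall(12, 2, 0)))  // prints "1.00"

    // Kotlin A v. B: Total = 8, FP = 0, FN = 2
    println(fmt(precision(8, 0)))   // prints "1.00"
    println(fmt(recall(8, 0, 2)))   // prints "0.80"
}
```

Under this reading the Kotlin row reproduces the reported 1 and 0.8, while the Dart row gives a precision of about 0.83 against the reported 0.84, so the exact counting or rounding used for the table may differ from this sketch.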

Cross-language evaluation

The cross-language comparison between code versions A and B is also successful in detecting the corresponding clones. This is because the node information and the eCST structure are the same for the two code snippets. Every node is matched to its corresponding node in the other eCST, even though the programming languages differ. When comparing any version A with any version B, we find Type 3 clones for the for and while statements and for the definition of the variable i. Additionally, we find the Type 1 clones for the declaration and assignment of the sum variable, as before.
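
The idea that nodes can be matched across languages once both programs share a tree shape can be sketched as a recursive comparison. This is an illustration only, not the detection algorithm used by Out of Step: the EcstNode class, the kind labels, and the matches function are hypothetical, and pairwise, in-order child matching is an assumption.

```kotlin
// Hypothetical eCST node: a language-independent kind plus its children.
data class EcstNode(val kind: String, val children: List<EcstNode> = emptyList())

// Assumption: two subtrees match when their kinds agree and their
// children match pairwise, in the same order.
fun matches(a: EcstNode, b: EcstNode): Boolean =
    a.kind == b.kind &&
        a.children.size == b.children.size &&
        a.children.zip(b.children).all { (x, y) -> matches(x, y) }

// Toy tree for "sum = sum + i"; the source language no longer appears in it.
fun sumAssignment(): EcstNode =
    EcstNode(
        "ASSIGNMENT",
        listOf(
            EcstNode("NAME"),
            EcstNode("BINARY", listOf(EcstNode("NAME"), EcstNode("NAME")))
        )
    )

fun main() {
    val fromDart = sumAssignment()    // as it might look when built from Snippet 1
    val fromKotlin = sumAssignment()  // as it might look when built from Snippet 3
    val increment = EcstNode("UNARY", listOf(EcstNode("NAME")))

    println(matches(fromDart, fromKotlin)) // prints "true"
    println(matches(fromDart, increment))  // prints "false"
}
```

Because the comparison looks only at kinds and tree shape, it gives the same answer whichever language each tree was built from.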

In the loop analysis, Out of Step finds a couple of Type 2 false positives, caused by the two assignments present in version B of the code. In this case, the detection algorithm reports the bodies of the for and while statements as clones. This happens because the intermediate type node is the same for both of them. The same behavior occurs with the assignment before the loop statement in both cases.

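| Algorithm | Total | Type 1 | Type 2 | Type 3 | FP | FN | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| Dart v. Kotlin A | 7 | 4 | 3 | 0 | 1 | 0 | 0.86 | 1 |
| Dart v. Kotlin B | 17 | 6 | 9 | 2 | 1 | 3 | 0.93 | 0.82 |
| Dart A v. Kotlin B | 9 | 2 | 6 | 1 | 1 | 1 | 0.89 | 0.89 |
| Dart B v. Kotlin A | 10 | 2 | 7 | 1 | 1 | 0 | 0.9 | 1 |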