* ICSE 2018 *
Sun 27 May - Sun 3 June 2018 Gothenburg, Sweden
Fri 1 Jun 2018 12:00 - 12:20 at J1 room - Search-Based Software Engineering I Chair(s): Shin Yoo

We report and fix an important systematic error in prior studies that ranked classifiers for software analytics. Those studies did not (a) assess classifiers on multiple criteria, nor did they (b) study how variations in the data affect the results. Hence, this paper applies (a) multi-criteria tests while (b) fixing the weaker regions of the training data (using SMOTUNED, a self-tuning version of SMOTE). This approach leads to dramatically large improvements in software defect prediction performance. When applied in a 5*5 cross-validation study on 3,681 Java classes (containing over a million lines of code) from open source systems, SMOTUNED increased AUC and recall by 60% and 20%, respectively. These improvements are independent of the classifier used to predict quality. The same pattern of improvement was observed when SMOTE and SMOTUNED were compared against the most recent class imbalance technique. In conclusion, for software analytics tasks like defect prediction, (1) data pre-processing can be more important than classifier choice, (2) ranking studies are incomplete without such pre-processing, and (3) SMOTUNED is a promising candidate for pre-processing.
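SMOTUNED builds on SMOTE, which balances a training set by synthesizing extra rows for the minority (defective) class. The following is a minimal sketch of that core interpolation step only, not the authors' implementation. The function name `smote`, its parameters `n_new`, `k`, and `seed`, and the use of Euclidean distance are assumptions made for illustration; the abstract does not say which parameters SMOTUNED tunes, so they are fixed by hand here.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class rows.

    Illustrative sketch of plain SMOTE: pick a minority row, pick one of its
    k nearest minority neighbours, and sample a point on the segment between
    them. `minority` is a list of equal-length lists of floats.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)

    def distance(a, b):
        # Euclidean distance (an assumption; other metrics are possible).
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every minority row except `base` itself.
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: distance(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Hypothetical two-feature minority rows, for demonstration only.
    defective = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in smote(defective, n_new=3, k=2, seed=1):
        print(row)
```

Because every synthetic row lies on a segment between two existing minority rows, the new rows stay inside the region the minority class already occupies. A self-tuning variant would wrap a call like this in a search over its parameters, scored on held-out data; that search loop is not shown here.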

Talk Presentation Slides (ICSE 2018_Smote.pptx), 6.73 MiB

Fri 1 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:30
Search-Based Software Engineering I, Technical Papers, at J1 room
Chair(s): Shin Yoo Korea Advanced Institute of Science and Technology
11:00
20m
Talk
Testing Vision-Based Control Systems Using Learnable Evolutionary Algorithms
Technical Papers
Raja Ben Abdessalem SnT Centre/University of Luxembourg, Shiva Nejati SnT Centre/University of Luxembourg, Lionel Briand SnT Centre/University of Luxembourg, Thomas Stifter
11:20
20m
Talk
To Preserve or Not to Preserve Invalid Solutions in Search-Based Software Engineering: A Case Study in Software Product Lines
Technical Papers
Jianmei Guo Alibaba Group, Kai Shi
11:40
20m
Talk
Nemo: Multi-Criteria Test-Suite Minimization with Integer Nonlinear Programming
Technical Papers
Jun-Wei Lin University of California, Irvine, Reyhaneh Jabbarvand University of California, Irvine, Joshua Garcia, Sam Malek University of California, Irvine
12:00
20m
Talk
Is "Better Data" Better Than "Better Data Miners"?
Technical Papers
Amritanshu Agrawal North Carolina State University, Tim Menzies North Carolina State University
12:20
10m
Talk
Q&A in groups
Technical Papers