osre24
Understanding Data Leakage in Machine Learning: A Focus on TF-IDF
Hello again! This is my final blog post, in which I discuss the second piece of material I created for the 2024 Summer of Reproducibility Fellowship. As you may recall from my first post, I am working on the project Exploring Data Leakage in Applied ML: Reproducing Examples of Irreproducibility, with Fraida Fund and Mohamed Saeed as my mentors.
Kyrillos Ishak
Last updated on Sep 5, 2024
SummerofReproducibility24
AutoAppendix: Towards One-Click reproducibility of high-performance computing experiments
Hi everyone, I’m excited to wrap up the AutoAppendix project with our final findings and insights. Over the course of this initiative, we’ve worked to assess the reproducibility of artifacts submitted to the SC24 conference and create guidelines that aim to improve the standard for reproducible experiments in the future.
Klaus Kraßnitzer
Last updated on Sep 9, 2024
SoR
Reflecting on the ScaleRep Project: Achievements and Insights
Reproducing and validating fixes for throttling bugs in HDFS improved system stability and performance.
Shuang Liang
Last updated on Sep 2, 2024
SoR'24
Final Report: Stream processing support for FasTensor
FasTensor is a scientific computing library specialized in performing computations over dense matrices that exhibit spatial locality, a characteristic often found in physical phenomena data.
Aditya Narayan, Bin Dong, John Wu
Last updated on Aug 31, 2024
GSoC'24, gsoc2024, osre2024
Final Blog: ML in Detecting and Addressing System Drift
Hello! I’m Joanna! I have been contributing to the ML in Detecting and Addressing System Drift project under the mentorship of Ray Andrew Sinurat and Sandeep Madireddy. My project aims to design a pipeline to evaluate drift detection algorithms on system traces.
Joanna Cheng
Last updated on Aug 31, 2024
osre24, reproducibility
Final Blogpost: Reproducibility in Data Visualization
Hello everyone! I’m Triveni, a Master’s student in Computer Science at Northern Illinois University (NIU). I’m excited to share my progress on the OSRE 2024 project Categorize Differences in Reproduced Visualizations, which focuses on data visualization reproducibility.
Triveni Gurram
Last updated on Sep 4, 2024
SoR
ORAssistant - LLM Assistant for OpenROAD
Hello! I’m Palaniappan R, an undergraduate student at BITS Pilani, India. Over the past few months, I’ve been working as a GSoC contributor on the LLM Assistant for OpenROAD - Model Architecture and Prototype project, under the mentorship of Indira Iyer and Jack Luar.
Palaniappan R
Last updated on Dec 6, 2024
chip-design
Final Blogpost: Drift Management Strategies Benchmark
Hello there! I’m William, and this is my final blog for my proposal “Developing A Comprehensive Pipeline to Benchmark Drift Management Approaches”, part of the LAST project, under the mentorship of Ray Andrew Sinurat and Sandeep Madireddy.
William Nixon
Last updated on Sep 9, 2024
Hardware Hierarchical Dynamical Systems
Hi everyone! I am Ujjwal Shekhar, a Computer Science student at the International Institute of Information Technology - Hyderabad. I am excited to share my work on the project titled “Hardware Hierarchical Dynamical Systems” as part of the Open Source Research Experience (OSRE) program and Google Summer of Code.
Ujjwal Shekhar
Last updated on Dec 6, 2024
Reproducing and addressing Data Leakage issue: Duplicates in dataset
Hello! In this blog post, I will explore a common issue in machine learning called data leakage, using an example from the paper: Benedetti, P., Perri, D., Simonetti, M., Gervasi, O.
Kyrillos Ishak
Last updated on Aug 24, 2024
SummerofReproducibility24