Rice D2K Capstone sponsors take flight with their data

Students turn real-world data into solutions for corporations, non-profits, startups and research groups.

Students holding binoculars and pointing

“It’s a win-win situation. Our data science students get to work on real-world data and problems, and our sponsors get customized software with comprehensive documentation that allows them to continue implementing the solution,” said Arko Barman, Faculty Director of the Capstone Program in the Data to Knowledge (D2K) Lab at Rice University.

“In the last several semesters, our students have helped answer data questions for repeat sponsors like the Houston Fire Department, the Houston Audubon Society, and Bill.com. We also work with one-time sponsors such as research groups in the Texas Medical Center and new entrepreneurs. I know of at least one startup that implemented their capstone solution and is doing quite well. Of course, our confidentiality agreements prevent me from disclosing additional details.”

As the program’s director, Barman manages confidentiality and intellectual property (IP) agreements, sponsored research approval, and other legal requirements that permit Rice students to access and work on actual data obtained from their sponsors in a secure environment for the duration of the project.

He said, “It is still a bit of a surprise when I see that our students can access raw data that needs to be cleaned and processed before it can be analyzed. When I was an undergraduate student, most of our courses were theoretical and the data provided for our projects had been cleaned. The kind of raw data provided by our capstone sponsors is usually only available to students completing internships with an organization or perhaps as part of a graduate student’s Ph.D. research.”

One project and the students’ solution were so novel that the capstone team, with the sponsors’ help and under Barman’s supervision, published its methods and findings. The research, “Deep object detection for waterbird monitoring using aerial imagery,” was presented at the 21st IEEE International Conference on Machine Learning and Applications (ICMLA) in December 2022.

“There are just not a lot of people in the world using drone images to count birds and animals using machine learning,” said Barman. “When the Houston Audubon Society first approached us with the problem, we were immediately intrigued.” 

“They were attending our data science consulting clinic, which we run as a public service. In the clinic, we listen to your challenge and propose approaches for analyzing your data. Once we’d proposed a solution, the Houston Audubon Society asked for help creating and implementing it, which is beyond the scope of the clinic. We suggested they turn their real-life problem into a capstone project.”

To narrow the focus, the Audubon sponsors chose imagery collected from Chester Island and North Deer Island by an experienced waterbird surveyor and drone pilot. To kickstart their brainstorming, the students researched similar wildlife surveys that use machine learning, but they found very little work on counting waterbirds in drone imagery.

Barman said, “It was unexplored territory in many ways. All our capstone projects involve machine learning and other data analysis tools, and many of the projects also require some specialized knowledge or a background in an area like natural language processing or computer vision. The Houston Audubon Society project required students with a background in computer vision and deep learning.”

“The students on Team Audubon were a great match for the project, and their solution identified at least 16 unique species in the Chester Island drone images. The pipeline they built over two semesters can be applied by anyone with enough annotated data – whether it be on birds, animals, or any kind of wildlife. Drone imagery is ideal in areas too remote or too risky for human surveyors, or where the presence of humans scares off the animal or bird activity.”

Other capstone sponsors have been equally pleased with their results. In fact, the Houston Fire Department’s various capstone project results have captured the attention of their peers. In a recent D2K Capstone Showcase story, HFD accreditation manager Leonard Chan said, “Since our partnership with the Rice D2K program, HFD has emerged as one of the pioneers in the use of fire service data. Fire departments across North America have been interested in the data Rice D2K is analyzing on our behalf and want to know what is next.”

Barman welcomes both returning and new sponsors each semester. He said the timeline for a typical project begins with a sponsor’s inquiry and proposal submission three to five months in advance of a capstone assignment. Faculty members in the D2K Lab review each proposal and meet with potential sponsors to discuss their objectives and gauge the scope and feasibility of the project.

In addition to earning a grade for their work, the students present their project at the end of the semester in a D2K Capstone Showcase event. Judges, sponsors, and the public are invited to the event where students discuss their projects, explaining their challenges and solutions. The judges’ scores are used to identify the top three teams, with awards announced at the end of the evening.

A new sponsor, LivaNova, was matched with a student team that earned the judges’ top score in the Fall 2022 D2K Capstone Showcase. Team LivaNova applied advanced biostatistics concepts to develop a statistical model that evaluates the battery life of an epilepsy treatment device. The next award went to Team Bill.com for its solution to identify and eliminate duplicate vendor records, and Team HFD won the final Fall 2022 award for analyzing data to identify likely causes of traffic collisions impacting HFD vehicles.

“Although only three teams are awarded certificates each semester, everyone is a winner,” said Barman. “Data science students have gained valuable experience working on real-world problems, and sponsors have gained insight into their data – along with software to continue analyzing it in the future.”

The popularity of Rice’s new data science minor and graduate programs means the D2K capstone program is on a rapid growth trajectory. Matching each capstone student with a real-world problem is both a joy and a challenge for Barman.

“The variety of projects our sponsors bring to our students is just amazing. This is experiential learning at its best; you can feel the excitement in the room as the students dive into their challenges,” he said. 

“But the growth in the program means we are constantly on the lookout for new and returning sponsors. We work with sponsors to determine if their goals are achievable in a 1- or 2-semester timeframe and to ensure the objectives are measurable so that we can evaluate the students’ performance in solving the problem. We also help sponsors frame their problem in a way that will appeal to Rice students. Get them engaged and just watch them fly with your data.”

For more information on the D2K Capstone projects and sponsorships, email d2k@rice.edu.

D2K is now accepting project proposals for the Fall 2023 Capstone.

Pictured above: Students from the D2K capstone project with Audubon visiting Edith L. Moore Nature Sanctuary with one of the sponsors, Richard Gibbons. From left to right: Krish Kabra, Richard Gibbons, Alexander Xiong, Minxuan Luo, William Lu.

RSVP now for the Spring 2023 D2K Showcase on April 19 at Rice University.

Carlyn Chatfield, contributing writer
