2018: Judging Perspective

Judge’s Commentary
MathWorks Math Modeling Challenge 2018

Kelly Black, Ph.D.
Department of Mathematics
University of Georgia

Introduction

This commentary offers some broad observations about the students’ entries for this year’s MathWorks Math Modeling (M3) Challenge. The topic for the 2018 Challenge was how to compare the amount of food wasted to the need associated with families who are not food secure. Overall, the quality of the entries continues to improve, but from my personal anecdotal observations, the jump in improvements did not seem as great as it has in previous years. The number of new teams taking part in the event was not as large as seen in previous years, and I suspect that the advisers and students are maturing with respect to how they prepare.

One area in which the rate of improvement seemed stagnant was in the analysis of the resulting models. Student teams appeared to be providing a detailed introspection of their models at a similar rate as last year. It is still necessary for teams to provide a critical view of their model to be able to proceed past the earlier judging rounds, and this remains a key ingredient that sets a paper apart from other entries.

One area of impressive improvement is the students’ response to the first question. It seemed to be more commonplace to read a good discussion and competent approach for the first question. More teams seemed to be able to examine and respond to the first question as well as provide a good, basic model describing the situation. It is exciting to see the event move into another stage where a larger number of teams are able to make good progress and provide a good response. It is clear that advisers continue to adapt and find new ways to support their students. As in previous years, their efforts and dedication help inspire us as we read the results of their service and support.

In this commentary I share some personal insights from my observations. The commentary is divided into three parts. The first part explores some of the approaches employed by the student teams that are specific to the event this year. The second part examines some broader modeling issues and how they relate to the student entries. Finally, some observations about writing and sharing technical information are discussed.

Food Waste and Re-purposing Food

The three questions in this year’s event relate to how food is used, wasted, or repurposed. The first question required teams to determine if enough food is wasted in Texas to relieve the demand among food insecure families. The second question required teams to quantify the food needs for different types of households, and the third question required teams to determine strategies for repurposing the maximal amount of wasted food while weighing the costs associated with different ways to do this. In this section I will briefly examine each of the three questions in order. Also, an overview of some of the approaches that were used as well as some of the reactions to those approaches are discussed.

The first question was the most straightforward of the three questions. Most teams provided good models that were relatively uncomplicated, and most provided some insight into the magnitude of the problem. Most teams made use of linear models that took the amount of food wasted per person multiplied by the population. Some teams refined their model by first determining how much food from different categories is purchased, and then determining the total amount of waste for each food category.
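The linear models described above can be written compactly. The following is only a sketch: the symbols $W$, $w$, $P$, $p_c$, and $r_c$ are notation assumed here for illustration and are not taken from any team's paper.

```latex
W = w \, P
\qquad \text{or, refined by food category,} \qquad
W = P \sum_{c} p_c \, r_c
```

Here $W$ is the total food wasted per year, $w$ is the waste per person per year, $P$ is the population, $p_c$ is the amount of food in category $c$ purchased per person per year, and $r_c$ is the fraction of that category that is wasted. Stating the terms this way also makes the units of each factor explicit.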

Teams generally assumed similar amounts of waste for each food category, and then determined the amount of waste for each. Most teams decided that different kinds of food waste could be used at the same rates. Teams that recognized that some food categories could not be easily reclaimed were able to demonstrate an important insight into the problem. For example, dairy products may spoil quickly and be difficult to use in other ways after a short time, while products based on oil seeds may last longer and be easier to use in other ways.
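The difference between total waste and waste that can realistically be reclaimed can be sketched in a few lines. Every number below is a hypothetical placeholder chosen for illustration, not Challenge data, and the names `CATEGORIES`, `wasted_kg`, and `recoverable_kg` are assumptions of this sketch.

```python
# Hypothetical per-category inputs; none of these numbers come from the Challenge data.
# name: (kg purchased per person per year, fraction wasted, fraction of waste recoverable)
CATEGORIES = {
    "dairy": (80.0, 0.20, 0.10),     # spoils quickly, so little of the waste is recoverable
    "grains": (60.0, 0.20, 0.60),
    "oilseeds": (10.0, 0.20, 0.80),  # keeps longer, so more of the waste is recoverable
}


def wasted_kg(population, categories):
    """Total kg wasted per year in each category."""
    return {
        name: population * purchased * waste
        for name, (purchased, waste, _recoverable) in categories.items()
    }


def recoverable_kg(population, categories):
    """Kg per year of wasted food that could plausibly be reclaimed in each category."""
    return {
        name: population * purchased * waste * recoverable
        for name, (purchased, waste, recoverable) in categories.items()
    }


population = 1000
total_wasted = sum(wasted_kg(population, CATEGORIES).values())
total_recoverable = sum(recoverable_kg(population, CATEGORIES).values())
print(f"wasted: {total_wasted:.0f} kg, recoverable: {total_recoverable:.0f} kg")
```

With the same waste fraction in every category, the recoverable total is still much smaller than the wasted total, which is the insight that distinguished the stronger papers.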

A large percentage of teams were able to make good progress on the first question. Many of the approaches made use of linear models and were relatively easy to understand. Teams that were able to provide a good explanation of their model including a good discussion of the individual terms and the motivation for the terms tended to make a more favorable impression on the judges. Some teams were able to go further and provide a good, basic analysis of their models. For example, a team that was able to successfully discuss the impact of a small percentage change in one of the parameters and compare the difference between an error in the slope versus an error in the intercept demonstrated that they know how to assess a model and explore the impacts of different parts of their model.

While most teams were able to provide a good response to the first question, the second question was more challenging. To our surprise, it also turned out to be more challenging than the third question. The second question required teams to go back to their solution to the first question and make modifications based on demographic information. The teams made use of a wide variety of approaches, and a few broad, common trends emerged. To address the second question, teams had to decide how different aspects of the demographic information impacted personal behaviors, and they had to determine which relationships best predicted the personal choices people make. For example, some teams used annual incomes as the primary factor determining food use while other teams incorporated factors such as gender and age.

With respect to annual income, teams explored a wide variety of relationships. For example, some teams simply used a linear relationship with the idea that people spent more on food when they had more money available to spend. Other teams used high order polynomials so that they could obtain better regression statistics. Notably, some teams made use of a logistic function where the motivation is that once a family obtained enough food their food use would not increase, and a family’s food usage would reach a plateau.
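One possible form of the logistic relationship mentioned above is sketched here. The symbols $F$, $I$, $F_{\max}$, $k$, and $I_0$ are assumptions made for illustration and may differ from the notation any particular team used.

```latex
F(I) = \frac{F_{\max}}{1 + e^{-k\,(I - I_0)}}
```

Here $F(I)$ is the food use of a family with annual income $I$, $F_{\max}$ is the plateau in food use, $k$ controls how quickly food use rises, and $I_0$ is the income at which food use reaches half of the plateau. As $I$ grows, $F(I)$ approaches $F_{\max}$, whereas a linear model $F(I) = a + bI$ with $b > 0$ increases without bound.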

Finally, the third question required teams to examine ways that excess food could be put to use to alleviate the needs associated with families living in food insecure situations. The solutions to the third question varied widely, and the teams’ interpretation of the question also varied widely. In particular, different teams interpreted the word “re-purpose” in different ways. For example, some teams decided that redistributing food to people who need it met the goals of the question. Other teams decided that to re-purpose food required that the food be used in a different way.

The judges made every effort to balance the different approaches to the third question. As long as a team made it clear how they interpreted the question and remained consistent with that interpretation then it was considered an appropriate response. Our primary concern is that teams provide a good analysis of their resulting model and justify their claims. For example, a team that discussed costs associated with the redistribution of foods to other people was given the same consideration as a team that focused on composting wasted food as a way to support local agricultural initiatives.

Modeling

We examine some of the general trends observed with respect to mathematical modeling. In particular we first note some general modeling issues such as the importance of units, and how regression is used to construct models. The next topics discussed are the interpretations of the production and distribution chain for food products followed by the use of diagrams and charts. Finally, a note is made about the process of modeling as an iterative cycle.

Units

The way units are used for variables is a recurring topic every year. It is a vital aspect of modeling in general, and professional mathematicians generally first ask about units when examining any new expression. It makes an immediate positive impression on the reader when the units are clearly stated and used in a consistent way. Units allow the reader to decide if the model is self-consistent, as well as help the reader understand how the teams interpreted the role of different parts of a mathematical expression.

For question one, teams generally made use of relatively simple linear models in appropriate ways. The primary difference was in how they quantified food use. For example, some teams focused on the number of required calories per day, others used kilograms per person, while others converted food requirements into units of dollars per week. Most teams gave considerable attention to the way they converted different food requirements into one consistent set of units, but it could still be difficult to keep track of how the team addressed this aspect of the question.

Making the units clear allows the reader to interpret the role different variables play within a given expression. It also allows the reader to pick up a sense of scale. For example, if one team makes use of time units in years while another uses days, the reader can quickly gain an appreciation for how certain values have different impacts when trying to make comparisons across papers. When the units are not clear, however, the result is that the reader has to immediately begin hunting around different parts of the paper to try to understand how things fit together. This can disrupt the reader’s flow and increase the likelihood that a model will not be correctly understood by the reader.
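One way to keep conversions easy to follow is to put the units in the names of the quantities and to collect the conversion factors in one place. The sketch below assumes a common unit of kilograms per person per year; the factors `KCAL_PER_KG` and `USD_PER_KG` are hypothetical placeholders, not values from the Challenge data.

```python
# Hypothetical conversion factors, for illustration only.
KCAL_PER_KG = 2000.0   # assumed average energy content of food
USD_PER_KG = 4.0       # assumed average price of food
DAYS_PER_YEAR = 365
WEEKS_PER_YEAR = 52


def kcal_per_day_to_kg_per_year(kcal_per_day):
    """Convert a daily calorie requirement to kilograms of food per year."""
    return kcal_per_day * DAYS_PER_YEAR / KCAL_PER_KG


def usd_per_week_to_kg_per_year(usd_per_week):
    """Convert weekly food spending to kilograms of food per year."""
    return usd_per_week * WEEKS_PER_YEAR / USD_PER_KG


print(kcal_per_day_to_kg_per_year(2000.0), "kg per person per year")
print(usd_per_week_to_kg_per_year(40.0), "kg per person per year")
```

Because each function name states both the input and output units, a reader can check the conversion without searching other parts of the paper.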

Food Production and Distribution Chain

One of the ways that the issue of units manifested itself is in the students’ interpretation of the food production and distribution chain. As food is grown, processed, and distributed, there are losses within each step of the process. Some teams made use of percentages of foods lost within each stage, and other teams made use of total amounts of food lost with each stage.

The two different interpretations resulted in different kinds of models. For example, teams that made use of percentages were likely to multiply the percentages across each stage to get a total loss. Teams that made use of total amounts lost were likely to add the amounts lost at each stage. Many teams divided the stages within each category resulting in a hybrid model. For example, the stages within production were divided into different parts which required multiplying percentages but then total losses from distribution were added.
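The two interpretations can be compared directly. In the sketch below the stage loss fractions and the starting amount are hypothetical, and the function names are assumptions of this example. Note that when losses are given as percentages of what enters each stage, it is the retained fractions that multiply, so the percentages cannot simply be added.

```python
def loss_from_fractions(initial_amount, loss_fractions):
    """Total loss when each stage loses a fraction of whatever reaches it."""
    remaining = initial_amount
    for fraction in loss_fractions:
        remaining *= (1.0 - fraction)
    return initial_amount - remaining


def loss_from_amounts(stage_losses):
    """Total loss when each stage's loss is already an absolute amount."""
    return sum(stage_losses)


# Hypothetical chain: production, processing, distribution.
initial_tonnes = 1000.0
fractions = [0.10, 0.05, 0.20]

multiplicative_loss = loss_from_fractions(initial_tonnes, fractions)

# The same chain expressed as absolute amounts lost at each stage.
amounts = []
remaining = initial_tonnes
for fraction in fractions:
    lost = remaining * fraction
    amounts.append(lost)
    remaining -= lost
additive_loss = loss_from_amounts(amounts)

# Adding the percentages directly overstates the loss.
naive_loss = initial_tonnes * sum(fractions)

print(f"multiplicative: {multiplicative_loss:.1f} t")
print(f"additive:       {additive_loss:.1f} t")
print(f"naive sum of percentages: {naive_loss:.1f} t")
```

The multiplicative and additive forms agree when the amounts are computed stage by stage, which is why a paper needs to say clearly which interpretation, and which units, it is using.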

If a team was not clear in how they interpreted the connection between the way food was lost and how it manifested itself in the resulting model then it was more difficult to read and understand their efforts. On the other hand, teams that made their interpretation clear and made their units clear provided a much better context for the reader to understand their efforts. This generally places the team’s results in a much better light and makes a more positive impression.

Regression

As mentioned in the previous section, regression played an important role in this year’s event. For the previous couple of years teams were less likely to use regression techniques. In this year’s event, the data provided leads to the use of regression as a way to bind individual behaviors into how food is used and wasted. It is a credit to the teams and their advisers that students continue to improve in how to make use of this important tool in more nuanced and appropriate ways.

To expand on the example in the previous section, teams were asked to decide how food use is related to a family’s annual income. It was common to see a linear regression used to determine the specific relationship. A number of teams made use of high order polynomials, and noted that they resulted in better fits. This is problematic in that it implies that as income increases the amount of food purchased by a family could increase (or decrease) in a dramatic fashion.
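The danger of a high order polynomial can be seen with a small experiment. The sketch below uses invented data and, as an extreme case of a high order fit, the degree 5 polynomial that passes exactly through all six points (evaluated in Lagrange form) rather than a least-squares polynomial regression. It is compared with an ordinary least-squares line.

```python
# Hypothetical data: income in units of $10,000 and annual food spending in $1,000.
# These values are invented for illustration and are not Challenge data.
incomes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
spending = [3.0, 3.9, 4.3, 5.2, 5.4, 6.1]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolating_polynomial(xs, ys, x):
    """Evaluate the degree len(xs)-1 polynomial through every point (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


slope, intercept = fit_line(incomes, spending)
linear_at_10 = slope * 10.0 + intercept
poly_at_10 = interpolating_polynomial(incomes, spending, 10.0)

print(f"linear prediction at income 10:     {linear_at_10:.1f}")
print(f"polynomial prediction at income 10: {poly_at_10:.1f}")
```

The polynomial matches every data point exactly, so its regression statistics look perfect, yet its prediction just outside the range of the data is far larger than anything in the table. A better fit statistic alone is therefore not a justification for the model.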

Other teams used a linear model. If a team justified the model based on the idea that the more money people have available then the more they will spend on food, then that is a reasonable justification. A number of other teams noted that eventually the amount of food a family purchases will level out. The idea is that once a family’s nutritional needs are met they will not feel the need to continue to obtain more food. These kinds of insights provide important justifications for the choices that a team made and explicitly demonstrate that a team understands how models are motivated by the context first and then mathematical expressions are constructed to mimic the phenomenon of interest.

Finally, some teams went to great pains to construct models based on regression, but when they went back to make their calculations used the same values in the domain that were given in the tables previously used to construct the regression model. Some teams used values that could have been read directly from the table rather than from the equation resulting from a regression. The regression model can be of interest in this case, but only with respect to the analysis of the model. If the team does not provide insight into how the model gives different results for various changes in the model then it is better to simply use the tables directly.

Flow Charts and Diagrams

One way to share a mathematical model is to make use of a chart or a flow diagram. For example, using a flow chart to demonstrate how food moves through the different production, processing, and distribution phases makes it easier to understand the broad flow of how food is transferred to consumers. In prior years the use of diagrams was more prevalent. This year, however, I did not see as many diagrams.

For the first question, the food cycle is relatively linear, so the broader context is not as complicated as other situations. For question three, though, many teams constructed approaches with a number of different inter-related steps. Being able to visualize a team’s broader idea in a compact form can be helpful in understanding the team’s model.

Modeling as a Cycle

Another aspect that I saw less of this year is the idea that modeling is an iterative process. In practice, professional modelers in industry and academia generally start with the most modest model possible that still manages to capture the basic phenomena. The next step is to go back and evaluate and critique the model to determine if it provides additional insights into the phenomena of interest. Generally, the answer is no, and further improvements are required. Modelers must then go back, make changes, and examine the new model. This whole process is then repeated.

The time available for M3 Challenge is too short and the situations are too complex for a team to complete many iterations of this cycle. It is important, though, for a team to convey a sense of understanding of the larger process. We do not expect teams to be able to put together a complete and accurate model, and we hope that they start with something relatively modest as a way to gain insight into a complex and difficult situation such as addressing scarcity of food.

It is important to provide some basic analysis of the model and make decisions about which aspects of the model should be updated. This can take a number of different forms. The most basic form is to simply note the strengths and weaknesses of a model, which can take the form of a general discussion. Teams should briefly state which aspects of the model are good and should be maintained and compare those aspects with the other parts of the model that are not an accurate reflection of the situation.

Another way to analyze a model is to see if the model can be used to make predictions of situations that are known. For example, if data are available for food use for one segment of a population, then the model can be used to predict what will happen with that known group. A comparison can then be made between the prediction and the known values. Noting a disparity in the prediction and offering a way to improve the model gives a strong indication that the team has a deep understanding of the way models are constructed, examined, and improved.
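A comparison of this kind needs very little machinery. The sketch below assumes a team already has model predictions and known values for the same groups; the numbers and function names are hypothetical.

```python
def percent_errors(predicted, observed):
    """Signed percent error of each prediction relative to the known value."""
    return [100.0 * (p - o) / o for p, o in zip(predicted, observed)]


def mean_absolute_percent_error(predicted, observed):
    """Average size of the percent errors, ignoring sign."""
    errors = percent_errors(predicted, observed)
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical known food use for three groups and the model's predictions for them.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

for group, error in enumerate(percent_errors(predicted, observed), start=1):
    print(f"group {group}: {error:+.1f}%")
print(f"mean absolute percent error: {mean_absolute_percent_error(predicted, observed):.1f}%")
```

The signed errors show where the model runs high or low, which is the starting point for a discussion of how the model might be improved.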

Finally, another way to analyze a model is to see what happens to a model’s predictions when small changes are made. The values for different parameters can be difficult to determine, and their values can be different based on different assumptions. For example, in a linear model, a small change in the slope can result in a relatively large change in the predicted value for larger values of the input number. A small change in the y-intercept for a linear model can result in a relatively large change in the predicted value for smaller values of the input number. This kind of sensitivity analysis can offer important insights into the model itself and offers important clues as to which part of a model should be examined more closely.
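The slope and intercept example above can be checked numerically. This is a minimal sketch with hypothetical parameter values; `relative_change` is a helper name assumed for this example.

```python
def predict(slope, intercept, x):
    """Linear model y = slope * x + intercept."""
    return slope * x + intercept


def relative_change(slope, intercept, x, slope_bump=0.0, intercept_bump=0.0):
    """Relative change in the prediction at x when a parameter is scaled by (1 + bump)."""
    base = predict(slope, intercept, x)
    bumped = predict(slope * (1.0 + slope_bump), intercept * (1.0 + intercept_bump), x)
    return (bumped - base) / base


SLOPE, INTERCEPT = 2.0, 10.0  # hypothetical parameter values

for x in (1.0, 100.0):
    from_slope = relative_change(SLOPE, INTERCEPT, x, slope_bump=0.05)
    from_intercept = relative_change(SLOPE, INTERCEPT, x, intercept_bump=0.05)
    print(f"x = {x:5.1f}: 5% slope error -> {100 * from_slope:.2f}%, "
          f"5% intercept error -> {100 * from_intercept:.2f}%")
```

For the small input the intercept error dominates, and for the large input the slope error dominates, so the parameter that deserves closer examination depends on the range of inputs the model is meant to cover.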

Writing

In this last section the role of writing and expression is discussed. The importance of clear, concise writing cannot be overstated, and it is especially important in mathematical modeling. The expressions we use are made up of quantities that are used to mimic complex physical phenomena. It is difficult to relate the expressions to the phenomena of interest, and it requires extra care and attention to ensure that our ideas are correctly shared. In this section, I focus on four topics. The first is writing equations and their associated justifications. The second is significant digits. The third is the role of the executive summary. The last topic is the importance of citations and references.

The first topic discussed is the way mathematical expressions are presented. It was not uncommon this year to read a paper in which mathematical equations were stated without much motivation or discussion. Simply writing down an equation does not convey much information to the reader. It is important to keep in mind that the expressions that a team uses are completely fabricated based on their imaginations. They are the products of long discussions that evolve over time after focused concentration and debate.

When we read a team’s paper we have no idea where the equations came from and how they evolved. It is vital that the team provide some description of what the different terms mean. We do not want to know the history of the expression or how the team struggled with the equation. Rather, we want to hear the team’s insights into the expression itself. For example, if a team decided to use a logistic function to model food requirements as a function of a person’s age then the team should provide some discussion as to how the food requirements for an individual should level out in time and not change once the person stops growing physically.

The second topic is the issue of significant digits. I read a number of papers that included numbers that were written out to six or more values beyond the decimal point. Such precision is not realistic. Students should be aware of the number of significant digits in the original data and try to remain consistent.

The teams work with very different kinds of functions. Exponential functions and linear functions respond differently to changes in their input values, which makes it difficult to determine the appropriate number of significant digits. We do not expect students to be absolutely precise in determining the appropriate number of significant digits, but some effort should be made to avoid overly precise answers.
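Reporting a result to a sensible number of significant digits is easy to automate. The helper below is a minimal sketch; `round_sig` is a name assumed for this example and is not a standard library function.

```python
from math import floor, log10


def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit relative to the decimal point.
    exponent = floor(log10(abs(value)))
    return round(value, digits - 1 - exponent)


# A computed result carries many digits, but the input data may justify only a few.
computed = 1234.56789
print(round_sig(computed, 3))
print(round_sig(0.000123456, 2))
```

A team would choose `digits` to match the least precise data source used in the calculation.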

The third topic is the construction of the executive summary. This is a topic that comes up every year, and we have seen remarkable gains in the quality of the executive summary. This year was not an exception, and we continue to see students improve in this important aspect of the teams’ entries. An executive summary should at least include an overview of the problem, specific results, and provide some idea of the team’s approach to the problem. This year I can only recall a small number of entries that did not have the first two. Most were able to convey a broad sense of their approach which is a difficult task even for professional modelers. The advisers’ efforts are clearly evident, and they are doing a superb job of helping and preparing their students.

Another area that continues to see improvement is the use of both citations and references. Students continue to improve in providing the sources of their information in the reference section of their documents. Students are also doing a tremendous job of providing citations within the text to indicate which sources were used for their ideas. It is difficult for a team to produce a high-ranking paper without a strong summary and without citations and references to indicate their sources.

Conclusions

We seem to be seeing a new phase in the event this year. The Challenge is a nationwide event, and student teams and advisers are beginning to mature in how they approach the event. The analysis of the mathematical models is showing slow improvement. On the other hand, the students’ abilities to make good progress on the questions continue to improve.

In this commentary some broad trends are given in how the teams approached the specific questions. Most importantly, many teams were able to put together good, relatively simple models for the first question. The second and third questions presented greater challenges. Also, some general modeling issues are discussed. Most notably, the way regression was used was markedly different this year. Finally, some broad issues with respect to writing are provided.

The students’ writing continues to show improvement. Even in the first rounds most reports are well written. In particular the executive summaries are generally well written overviews of the students’ work. It is tempting to take for granted that students will put together well written and well considered mathematical models, and at times we have to remind ourselves that the students are putting together remarkable papers that we would not have been able to do at their ages. We recognize that the teams and their advisers put a great deal of thought and care in preparing for the event, and we are particularly grateful for the dedication and service of the teams’ advisers. Thank you!

Acknowledgements

I am grateful for the help and insights of Kathleen LeBlanc and Michelle Montgomery from the Society for Industrial and Applied Mathematics. Their editorial skills and keen eyes resulted in many improvements to this document. Their help is greatly appreciated.