
With a Little Help From My Friends: An Empirical Study of the Interplay of Students' Social Activities, Programming Activities, and Course Success
Adam S. Carter and Christopher D. Hundhausen
Human-centered Environments for Learning and Programming (HELP) Lab
School of Electrical Engineering and Computer Science
Washington State University
Pullman, WA 99164 USA
+1 509-335-6602
cartera@wsu.edu, hundhaus@wsu.edu
ABSTRACT
Computing education researchers have become increasingly interested in leveraging log data automatically collected within computer programming environments in order to understand students' learning processes and tailor instruction to student needs. While data on students' programming activities has been positively correlated with their learning outcomes, those data tell only part of the story. Another part of the story lies in students' social activities, which, according to social learning theory, can also be predictive of students' learning outcomes. In order to gain further insight into how computing students' learning processes influence their learning outcomes, we present an empirical study that explores the interplay of students' social activities, programming activities, and course outcomes in an early computing course. By analyzing log data collected through a programming environment augmented with a social networking-style activity stream, we found that answers to questions posed through the activity stream were positively correlated with students' ability to make programming progress, and their eventual success in the course. Based on our findings, we present recommendations for the design of pedagogical environments to support a more social programming process.

Categories and Subject Descriptors
D.2.8 [Metrics]: Performance measures, Process metrics, Product metrics.
K.3.1 [Computer Uses in Education]: Collaborative learning;
K.3.2 [Computer and Information Science Education]: Computer science education.

General Terms
Design, Experimentation, Human Factors.

Keywords
Learning analytics, Predictive models of student performance and achievement, Educational data mining, Activity streams, Social networking.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
ICER '16, September 08-12, 2016, Melbourne, VIC, Australia
© 2016 ACM. ISBN 978-1-4503-4449-4/16/09…$15.00
DOI: http://dx.doi.org/10.1145/2960310.2960322

1. INTRODUCTION
In recent years, the ease with which learning process data can be collected, coupled with the availability of low-cost, high-power machines to store and process such data, has led to an explosion in the field of learning analytics [29]. Computing education researchers, for instance, have collected data on students' programming processes, and have used those data to identify programming difficulties and how they are resolved [2]. Similarly, researchers have created predictive models that relate students' programming behaviors to course outcomes [1, 7, 27]. While these models have achieved only modest predictive power, they do pave the way toward a future in which computing educators, with minimal effort and involvement, can leverage continuous updates on individual student learning processes and achievement.

Even though programming activities figure prominently in learning success, any approach that attempts to predict student success based solely on programming activities may overlook other factors critical to students' success or failure. For example, according to social learning theory, it makes sense to consider students' social activities: how they interact with their peers during the programming process. Indeed, past research provides evidence that a student's social behavior is a significant predictor of learning outcomes in computing courses [12]. This suggests that a broader picture of student achievement might be obtained by examining the interplay of a student's social behavior and programming activities.

In an effort to observe students' social and programming behaviors as they work on course programming assignments, we have developed a social programming environment (SPE), which augments a traditional IDE with features commonly found in social networking software [8], most prominently an activity stream in which students can asynchronously communicate about their ongoing programming activities. Our SPE provides an empirical foundation for studying the interplay of social behavior, programming behavior, and student achievement in computing courses. As a starting point for such an investigation, we propose three basic research questions:

RQ1: What kinds of programming questions do students pose within an SPE?

RQ2: How does programming behavior influence social behavior and vice versa?

RQ3: What is the relationship between students' social participation and course grades?
In this paper, we present an empirical study that uses log data collected within our SPE to address these questions. We found that students who pose questions, receive a suggestion, and make a follow-up post were significantly more likely to have made positive progress towards correct programming solutions. Furthermore, the scores received for programming solutions produced by these students were significantly more likely to exceed the class average.

Our study makes two key contributions to the computing education literature. First, while others have studied the relationship between students' programming activities and course outcomes, this paper presents the first-ever empirical study that uses automatically collected log data to explore the interplay between students' programming activities, social activities, and course outcomes. As such, it presents a more rounded attempt to characterize the relationship between computing students' learning processes and learning outcomes. Second, our study contributes evidence-based recommendations for the design of future pedagogical environments to support a more social computer programming process in early computing courses.

2. BACKGROUND AND RELATED WORK
This research draws on two lines of related research that (a) have used students' programming data to predict their learning outcomes, and (b) have explored the influence of students' social activities on their learning outcomes. Guided by social learning theory, this research also builds on our prior effort to build a social programming environment to study the interplay of programming and social activities. Below, we review these lines of related and foundational work.

2.1 Using Programming Data to Predict Learning Outcomes
A large body of educational research has explored the extent to which various learner variables are able to predict learning outcomes or future learning behaviors. These variables include the learner's background [e.g. 6, 16], prior knowledge [e.g. 6, 19], cognitive abilities [e.g. 23], time-on-task [e.g. 24] and learning attitudes [e.g. 5]. While this line of research shares our interest in drawing associations between students and course outcomes, it differs from our work in that it is based on written surveys administered periodically, rather than on continuous learning process data automatically collected through a learning environment.

A more recent line of research has focused on leveraging a continuous stream of data in order to identify patterns of learning associated with positive learning outcomes. As such, it falls within the emerging areas of educational data mining and learning analytics [3, 25], which, in many STEM fields, have been used to gain insights into the processes that underlie student learning, and ultimately to better tailor instruction. A foundational idea is to build learner models that infer learners' background knowledge, learning strategies, and motivations from learning process data [19]. In turn, such models are used to adapt instruction to learner needs.

Duval and Verbert [26] categorize two approaches for the application of learning analytics: one that focuses on identifying patterns of behavior, and another that focuses on deriving interventions aimed at improving the learning process. In computing education, the former approach has been used to describe compilation behavior [e.g. 2, 15] and to identify behaviors associated with eventual success or failure in a course [e.g. 1, 7, 14, 28]. The latter approach (developing learning interventions) has been less explored in computing education, although there are some notable exceptions [e.g. 10, 20]. In both cases, researchers have yet to seriously capitalize on data generated from students' online social interactions.

2.2 The Influence of Social Activities on Learning Outcomes
Students' online social interactions have been widely studied. For example, researchers have examined the impact of social participation on course performance [12] or on particular skills [e.g. metacognition; see 21]. Alternatively, researchers have created online social environments to facilitate pedagogically beneficial activities such as peer code reviews [11] or to generate and critique test questions [9]. In these cases, researchers correlate usage of the system with a proxy for course performance, often an exam or final grade [e.g. 11, 18]. In contrast to prior work, the research presented in this paper directly studies the influence of students' unstructured question-asking behavior on their future coding behaviors.

2.3 Social Learning Theory
This work is predicated on the theoretical frameworks of social learning. Situated Learning Theory [17] holds that participation in a community of practice, which involves both the observation of others and actual participation in community activities, facilitates learning. Furthermore, Bandura's self-efficacy theory [4] posits that students develop a positive sense of their own programming abilities (so-called self-efficacy) by being able to observe the activities of their peers, by being able to evaluate themselves relative to their peers, and by engaging in their own performances. Resonant with this theory, Rosson et al. [22] identified strong positive correlations between a number of attitudinal variables, including self-efficacy, and a learner's orientation towards the computing discipline. Both of these social theories of learning suggest that learners may have difficulties making progress if they are forced to program in isolation from a broader community, as has traditionally been the case in computing education. This is especially true with the type of individual programming assignments that are common in early computing courses.

2.4 Social Programming Environments
Guided by these theories, our research lab has been investigating the impact of social programming environments (SPEs; see Figure 1) on learning processes and outcomes in early computing courses. In previous work, we found that SPEs can have a positive impact on students' sense of community [8], and that regular participation in such an online social environment is positively correlated with course outcomes [12]. Furthermore, we found that data collected within our SPE can be used to construct predictive models of student success [7]. While our SPE has allowed us to collect the data necessary to explore the interplay between programming behavior, social behavior, and student performance within the context of early computing courses, we do not claim that an SPE is a necessary component for facilitating online social discourse within a classroom. Indeed, in past research, we have studied computing students' social discourse within a variety of online environments, including Facebook and more traditional learning management systems [12, 13].

Figure 1: The OSBLE+ social programming environment

3. STUDY METHODOLOGY

3.1 Participants
Participants in this study were enrolled in the spring 2014 offering of CS2 at Washington State University. Focusing on the C and C++ programming languages, this 15-week course had three 50-minute
lectures and one 170-minute lab period per week, three exams (two midterms and a final), and seven individual programming assignments. The course enrolled 140 students, 129 of whom finished the course and received a grade. 108 of these students (100 men, 8 women) consented to releasing their learning process data (automatically collected within the SPE) and course grades data for this study.

3.2 Data Collection, Sampling, and Analysis Method
Our SPE (see [8]) was used to collect students' online social activity and programming behaviors. The SPE collected 21,952,494 points of interaction and 10,720 discussion posts. In an attempt to make the data manageable, we employed a principled approach to sampling the content. To this end, we randomly sampled 10 students who received each of the five possible course grades (A, B, C, D, and F). While we were able to sample 10 students from each of the A, B, and C levels, only four students were available who received D's in the course, and only two students were available who received F's. Therefore, our sample could include only 36 students, instead of the 50 that we targeted.

Having chosen our sample, we next identified the 2,352 instances in which students in our sample were involved in some kind of social activity (either a post or reply on the SPE's activity stream). Since our research questions related to the interplay of social and programming behaviors, we opted to focus only on social activities (posts and replies) that had to do with programming activities. This refinement yielded 461 posts. Finally, in order to focus our study even more intently on our research questions, we further pruned the sample to include only posts and replies that were part of a thread that was initiated by a programming question. This resulted in a final sample of 93 threads.

For each of the 93 threads of social activity in our sample, we carried out the analyses described in Table 1 in order to provide a foundation for addressing our research questions.

In order to categorize question content (#1 in Table 1), we extended a previously used content coding scheme (see [13]), which we present in Table 2. To verify the validity of this coding scheme for the present analysis, the two authors independently coded a 20% sample of the corpus (n = 19 posts), attaining an overall agreement of 88% (0.83 kappa). Having established a sufficiently high level of inter-rater reliability for this coding scheme, the first author coded the remaining posts.

Table 1. Analyses of Programming Questions
# | Description of Analysis | Relevant RQs
1 | Categorize question content | RQ1
2 | Determine whether question related to question author's current build state | RQ2, RQ3
3 | Determine whether responses to question were posted | RQ2, RQ3
4 | Determine whether programming suggestions to address question were posted | RQ2, RQ3
5 | Determine whether question author explicitly acknowledged suggestions (with, e.g., "thank you") | RQ2
6 | Determine whether question author resolved the question in a future build | RQ2
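The inter-rater check described above reports both raw agreement and Cohen's kappa. As a minimal sketch of how those two figures relate, the following computes them for two coders' labels. The function name `cohens_kappa` and the six example labels are hypothetical illustrations, not the study's data or tooling.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items that received the same code from both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical codes for six posts (NOT the study's data).
coder_1 = ["IMPLEMENTATION", "LANGUAGE", "IMPLEMENTATION", "RUNTIME", "LANGUAGE", "COMPILE"]
coder_2 = ["IMPLEMENTATION", "LANGUAGE", "LANGUAGE", "RUNTIME", "LANGUAGE", "COMPILE"]
agreement, kappa = cohens_kappa(coder_1, coder_2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the agreement two coders would reach by chance given how often each uses every category.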
Table 2. Programming Question Content Categories
Category | Description | Example
COMPILE | Question relates to an issue encountered during program compilation. | "After I debug my code, I got this error, can anyone explain it to me? thank you Error 1 error LNK2019: unresolved external symbol […]"
IDE | Question relates to the operation of Visual Studio. | "Okay, does anybody else consistently get the problem where cin and cout are underlined with red and VS gives you the error "cout is ambiguous"?"
IMPLEMENTATION | Question asks for tips on how to best implement an algorithm or function. This is often, but not always, related to the requirements of a given lab or assignment. | "If both the player and the computer draw the same type of hand(say two pairs), who wins?"
LANGUAGE | Question asks about the C/C++ language, or about a programming issue related to the misunderstanding of syntax. | "Does anyone know if the strtok() keeps the old string or does it fully erase it?"
RESOURCES | Question requests external programming resources or tips. | "What did you guys use to create your UML diagrams?"
RUNTIME | Question relates to an issue encountered during the runtime execution of a student's code. | "Does anyone know what "vector subscript out of range" means and to fix it?"

In order to determine whether a question was related to the student's current build state (#2 in Table 1), we examined the build that came immediately before or immediately after the question was posed. We considered these builds to be related to the question only if we could find clear evidence connecting the question to the state of the code in the build. For example, if a student's question asked about substrings, the student's previous or subsequent build would need to contain code that uses substrings, attempts to use substrings, or calls a function that uses substrings. In cases in which both builds occurred more than a day before or after a question was posed, we coded the question as not related to the current build state.

Whether a given question had responses (#3 in Table 1) could be gleaned directly from a transcript of the activity stream. If at least one of the responses to a question suggested concrete action that the question author might take to address the question, we concluded that a suggestion was present (#4 in Table 1). Evidence of a question author's acknowledgment of a suggestion (#5 in Table 1) came in the form of any follow-up post by the author indicating that the author had read the post (e.g., "Thank you" or "What about…").

Finally, in order to determine whether the question author altered his or her code in a future build so as to resolve the original question posed (#6 in Table 1), we looked for clear evidence that the student's code had moved towards a resolution of the issue. For example, if the student asked a question about how to resolve a syntax error with using substring in her program, the error mentioned in the question would need to be resolved in a subsequent build.

In order to increase the likelihood that any progress made by question authors was influenced by their interaction in the corresponding activity stream thread, we considered subsequent builds that occurred within a reasonable time (two days) of the question author's last post to the thread. However, given that some students compiled their code infrequently, we required that a minimum of five compilations be considered. In some cases, examining a minimum of five compilations led us to consider compilations beyond the two-day window.

Two caveats come with the analysis approach just described. First, because it requires evidence that (a) programming questions be related to programming context, and (b) programming progress fall within a reasonable time after related programming questions are answered, our analysis approach may be seen as overly conservative. For example, it would have been possible for students to have asked a related coding question before they started coding their solution. Likewise, it would have been possible for students to have made positive strides towards a correct solution outside of our two-day, five-build window. Thus, it is possible that our analysis approach failed to identify some relationships between programming and social behavior.

Second, just because we are able to find evidence of positive programming progress that closely follows related social activity, this does not mean that such progress was made because of the social activity; indeed, such an assumption would fall prey to the post hoc, ergo propter hoc fallacy. Clearly, our study is correlational, not causal. These two caveats should be borne in mind in interpreting the results that follow.

4. RESULTS
Using the categories of Table 2, Figure 2 presents a high-level breakdown of the content of the 93 questions in our sample. In this figure, the bars represent the percentage of questions that fall into each content category; all bars add up to 100%. The shading of each bar indicates the proportion of questions in each content category that were related to the question author's immediate coding context. As this figure suggests, a strong majority of the questions (74%) focused on IMPLEMENTATION (46%) or LANGUAGE (28%). The least common questions focused on COMPILE (4%) and IDE (4%) issues. Moreover, we see that questions having to do with RUNTIME, COMPILE, and LANGUAGE issues were most commonly associated with a student's active coding solution. In contrast, we see that none of the questions in the RESOURCES category was related to an active coding solution. This makes sense given that, by definition, RESOURCES questions ask for external programming resources or tips.

4.1 Relationship between Suggestions Received and Programming Progress
82 of the 93 questions (88%) received a response; slightly fewer (75, or 81%) received a response containing a programming suggestion to address the question. Did question authors' participation in the activity stream relate to their subsequent ability to make programming progress? To explore this question, Figure 3 considers the extent to which we found evidence that question authors made positive progress when they (a) did not receive a suggestion, (b) received a suggestion but did not acknowledge the response, or (c) received a suggestion and acknowledged the response. This figure illustrates a clear increase in success rates: only 38% of question authors made progress without a suggestion, versus a success rate of 85% when authors received and acknowledged a suggestion from another student.
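The three feedback conditions compared in Section 4.1 combine analyses #4, #5, and #6 of Table 1. As a rough sketch of that tally, the following groups thread records by condition and computes the share that showed progress. The record fields (`suggestion`, `acknowledged`, `progress`) and the five example threads are hypothetical and do not reflect the study's data or its actual tooling.

```python
from collections import defaultdict


def feedback_condition(thread):
    """Map a thread record to one of the three feedback conditions."""
    if not thread["suggestion"]:
        return "no suggestion"
    if not thread["acknowledged"]:
        return "suggestion, not acknowledged"
    return "suggestion, acknowledged"


def progress_rates(threads):
    """Return, per condition, the fraction of threads whose author made progress."""
    totals = defaultdict(int)
    progressed = defaultdict(int)
    for thread in threads:
        condition = feedback_condition(thread)
        totals[condition] += 1
        progressed[condition] += int(thread["progress"])
    return {condition: progressed[condition] / totals[condition] for condition in totals}


# Hypothetical thread records (NOT the study's data).
threads = [
    {"suggestion": False, "acknowledged": False, "progress": False},
    {"suggestion": False, "acknowledged": False, "progress": True},
    {"suggestion": True, "acknowledged": False, "progress": True},
    {"suggestion": True, "acknowledged": True, "progress": True},
    {"suggestion": True, "acknowledged": True, "progress": True},
]
rates = progress_rates(threads)
for condition, rate in rates.items():
    print(f"{condition}: {rate:.0%}")
```

Keeping the condition assignment in its own function makes the coding rule explicit: acknowledgment only counts when a suggestion was actually received.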
According to a chi-squared test, there was a significant association between the type of feedback received (no suggestion, suggestion + no acknowledgement, suggestion + acknowledgement) and whether or not a future build demonstrated progress towards a correct solution (χ²(2) = 7.80, p = 0.02). A post-hoc z-test revealed a significant difference (p < 0.05) between question authors who did not receive a suggestion and those who received and acknowledged a suggestion. The middle condition, question authors who received a response but did not acknowledge the response, did not differ significantly from the other two categories.
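The reported p-value can be checked by hand, because a chi-squared distribution with 2 degrees of freedom has the closed-form upper tail P(X > x) = exp(-x/2). The sketch below applies that identity to the reported statistic of 7.80. The helper name `chi2_sf_df2` is an illustrative choice, and the formula holds only for 2 degrees of freedom.

```python
import math


def chi2_sf_df2(statistic):
    """Upper-tail probability of a chi-squared variable with exactly 2 degrees of freedom."""
    if statistic < 0:
        raise ValueError("a chi-squared statistic cannot be negative")
    # With df = 2 the chi-squared distribution is exponential with mean 2.
    return math.exp(-statistic / 2.0)


p_value = chi2_sf_df2(7.80)
print(f"p = {p_value:.3f}")
```

The result rounds to 0.02, consistent with the value reported in the text.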
Figure 2: Percentage of questions in each content category, broken down according to whether or not they were related to immediate coding context (legend: Related, Not Related)

Figure 3: Percentage of future builds that demonstrate progress based on social feedback received (categories: Without suggestion, With suggestion, With suggestion and acknowledgement)

4.2 Relationship between Social Activity and Course Outcomes
We next consider the relationship between discussions anchored in code and course outcomes, in order to further explore whether talking about code was a general indicator of success within the course. We begin by examining the relationship between the number of programming questions posed by a student and both the student's final grade and the average programming assignment grade. We failed to find a significant relationship between the number of questions posed and the student's final grade (F(1,18) = 0.2, p = 0.66) or assignment average (F(1,18) = 0.2, p = 0.66). Next, we compared students who posted coding questions to those who did not. Again, we did not find a significant relationship between these groups and final grade (F(1,105) = 0.03, p = 0.87) or assignment average (F(1,86) < 0.01, p = 0.96).

The data indicate that simply posting a question about code bears no relationship to a student's overall class performance. However, it is possible that receiving programming suggestions on a given assignment might have positively impacted the grade received for that assignment. To investigate this possibility, we examined the 23 instances in which a student asked a question, received a suggestion, and acknowledged the suggestion with a response. For each instance, we compared the grade the student received on that assignment to the average grade of students that did not receive help on the assignment. Figure 4 depicts this relationship. Out of these 23 observations, only three observations were below the class average. When taken as a whole, we see that the average score received by this group was 10 percentage points higher (88% vs. 78%) than the class average. A two-sample t-test with equal variance not assumed found this difference to be statistically significant (t(23) = 2.65, p < 0.01).
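A two-sample t-test with equal variance not assumed is commonly known as Welch's t-test. As a sketch of the mechanics only, the following computes the Welch statistic and its Welch-Satterthwaite degrees of freedom using the standard library. The function name `welch_t` and the two score lists are hypothetical and are not the study's grades.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Per-sample variance of the mean; the two samples are NOT pooled.
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    if v_a + v_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / (v_a + v_b) ** 0.5
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical assignment scores (NOT the study's data).
helped = [90, 85, 95, 88]
not_helped = [70, 80, 75, 78, 72]
t_stat, dof = welch_t(helped, not_helped)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Because the variances are not pooled, the degrees of freedom are generally fractional and smaller than the pooled value of n_a + n_b - 2.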
Figure 4: Grades of students who received help on assignment compared with those that did not for a given assignment

5. VIGNETTES
The prior section reported statistically significant positive relationships between participation in online discussions about code, coding progress, and course outcomes. In this section, we consider a series of vignettes, transcribed directly from the activity stream data collected in our study, that illustrate the interplay between students' coding behavior and social interactions. In exploring these vignettes, we see how rich discussions centered around coding can positively impact a student's coding progress. Likewise, these vignettes give us insights into what might happen when these kinds of discussions fail to materialize.
5.1 Vignette 1: Multiple Inheritance
The concept of inheritance in object-oriented programming has recently been introduced in class and James¹ would like to use inheritance in his game of Battleship. James would like to build a subclass that inherits from multiple parents. He can get his program to compile when inheriting from a single parent, but cannot figure out the syntax for multiple inheritance. He asks for help with the issue on the OSBLE+ activity stream:

¹ All names used in these vignettes are pseudonyms.


James: How do I make a class that is composed of two other classes? This is giving me an error: class Player : public Board : public Stats

Seeing James' question, Sharon provides a suggestion on how to accomplish multiple inheritance:

Sharon: Never done it myself, but this might help [url]… Try class Player : public Board, public Stats

James sees Sharon's suggestion and modifies his class definition accordingly. His project compiles and he reports back his progress:

James: That's it. Crisis averted!

Timothy, who is also toying with multiple inheritance, sees James' conversation and confirms Sharon's suggestion:

Timothy: This works. I have for instance, 5 different boat classes, all of which are inherited from a mothership parent class…

Once again, James thanks Timothy for his confirmation. In the final post of the conversation, Steven, an upperclassman, offers a cautionary message regarding multiple inheritance:

Steven: While you are all correct and multiple inheritance does work in c++. It can lead to a lot of errors, and stuff like the diamond problem…

Less than three hours after posing the question, James was back on track.

5.2 Vignette 2: Understanding Requirements
Jessica has started work on her homework assignment, but her unfamiliarity with the game of poker is preventing her from making progress on how to properly implement a required function. Unable to make headway, she uses OSBLE+ to ask the class about how the function should operate:

Jessica: In highCard(), it returns one card which has the highest rank and the highest suit, right? I just want to make sure since I am unfamiliar with [poker]

Stacie responds first:

Stacie: Your function wants to return the highest rank over the highest suit. However, if you had, say 2 jacks in your hand, it would return the Jack of Spades, before the Jack of hearts. But say you had an Ace no matter what the suit is it would return the ace over the jacks. Hope this makes sense.

Fred, who is also working on his poker assignment, asks a clarifying question:

Fred: Isn't high card just for a no hand combination? Meaning 2 jacks would make a pair, high card only returns in the instance of you have no hand right?

Barry responds to Fred:

Barry: It would only be relevant if you had a jack and the dealer had a jack. Suits are only there to be compared if there is a tie if both have the same high card ...and flushes of course.

Fred replies to Barry:

Fred: I took the said suit function out, but I'm putting it back in […]

To close out the conversation, Jessica thanks the discussion participants:

Jessica: Thanks guys

Jessica continues to work diligently, and a day later, she has a working implementation of her function. However, not only did Jessica help herself, she also helped Fred, who might otherwise have been too shy to ask a question.

5.3 Vignette 3: Learning a New Concept
In the latest homework assignment, Beau is required to implement a basic factory pattern for creating different employee types. As this concept was recently introduced, Beau has no experience writing factories and is thoroughly confused. He uses OSBLE+ to reach out for help:

Beau: Any ideas to write studentFromString, staffFromString, facultyFromString? How do those function help the fromString funtion?

Justin responds with an explanation of how the factory pattern works:

Justin: The fromString function determines whether the employee is student, staff, or faculty then calls studentFromString, staffFromString, facultyFromString respectively. In those functions it sets all the information it gets from the string then returns it back to the fromString function. The fromString function then returns that.

Apparently, other students are, like Beau, also struggling to understand how to implement a factory. Jessica writes:

Jessica: So a majority of the code should be in the Factory and not in main? […] What I'm having trouble with is bringing it back into main [partial code snippet]

Steven responds that in his case, the majority of his code is indeed inside the factory. James follows this with a code snippet of how he "brought it back into main":

Stephen: I just use: Employee *employ; and then in the while loop, just write: employee = factory.fromString(temp); cout<<employee->toString()<<endl; Hopefully, it can help you.

Sandwiched between the posts from Justin, Jessica, and Steven, Beau thanks the students for their suggestions. At the time Beau originally posed his question, he had not written any factory-related code. Less than a day later, Beau had implemented a fully operational factory.

5.4 Vignette 4: Unacknowledged Question #1
Like Beau from Vignette 3, Rachel is having trouble implementing a factory for her homework assignment. Also like Beau, Rachel decides to seek help on OSBLE+. Unfortunately, she accesses OSBLE+ a day after Beau's discussion. Therefore, the help she seeks has fallen from the first page of the activity feed. Unaware of the previous discussion, Rachel makes her own post asking for implementation strategies:

Rachel: I'm not sure what I should do in employeeFactory class, anyone can give me some ideas? Thank you.

Unfortunately, Rachel's post does not receive the same attention as Beau's post; she receives no responses. Without help, Rachel has to go it alone. Over the course of the next two days (the maximum window of observation), Rachel makes progress towards correctly implementing a factory. However, at the end, she has still not implemented a fully working solution.
5.5 Vignette 5: Unacknowledged Question #2
Sean encounters an issue related to IFNDEF/DEFINE preprocessor directives in his code. For some reason, large chunks of his code are being categorized as an "inactive preprocessor block." Confused, he poses his question on OSBLE+:

Sean: So, in one of my header files I accidentally hit some key that made everything within that ifndef an "Inactive Preprocessor Block" (when I minimize it says that)... so, what key did I hit and how do I reactivate it?

Unfortunately, nobody responds to his post. In an effort to solve his problem, Sean alters his preprocessor directives in a way that fixes his immediate error, but also introduces a potential bug that may cause issues in the future.

5.6 Vignette 6: Unacknowledged Response
Jon would like to create an array of pointers to use in his homework implementation, but he cannot figure out how to properly initialize the data structure. Looking for guidance, Jon poses his question on OSBLE+:

Jon: I'm trying to use an Employee ** employee1 in my main function to hold all the employees in my .csv file. However there is no way of initializing that. So how would I hold an array of employees in main so I can access them whenever I want (i.e. for the paycheck)?

Tyler responds with a code snippet demonstrating how he had initialized his array of pointers:

Tyler: I did employee1 = new Employee*[100] to initialize my array of employees

Jon never responds. Inspecting Jon's code, one finds that he appears to have implemented a different solution strategy. Future builds of Jon's solution do not include code related to 2D array initialization.

6. DISCUSSION
We organize our discussion of the results around our original research questions.

6.1 What Kinds of Programming Questions Do Students Pose Within an SPE?
Based on the content coding scheme presented in Table 2, nearly three-quarters (74%) of the programming questions posed within the OSBLE+ activity stream related to either language or implementation. The remaining questions had to do with runtime (10%), resources (8%), IDE (4%) and compilation (4%) issues. Given that this study focused on a CS2 course, it is perhaps unsurprising that so few questions had to do with IDE and compilation issues, which students at the CS2 level likely had already grappled with in previous computing courses. However, we found it somewhat surprising that such a small percentage of questions related to runtime issues (e.g., unexpected runtime behavior or runtime exceptions), which are notoriously difficult. We speculate that social interaction in an activity stream could provide a strong basis for addressing such issues.

6.2 How Does Programming Behavior Influence Social Behavior, and Vice Versa?
As illustrated in Figure 2, we found that at least half of all programming questions relate to the author's most recent program solution at the time the question was posed. Questions related to language, compile, and runtime issues were most commonly associated with the question author's current programming solution. This makes sense, as each of these question types is likely to arise as these issues are encountered by students. In contrast, not a single question about coding resources was related to the student's active programming solution. Again, this makes sense: students who are looking for programming resources are probably still formulating the problem and have not yet transitioned to writing actual code.

The vignettes presented in the previous section illustrate some ways in which social activity can influence coding behavior. In cases in which questions generated a healthy back-and-forth between their authors and other students, authors were likely to ultimately fix their issues. Furthermore, the vignettes illustrate the way in which discussions can attract students with similar issues. In this respect, we can see social activity as a mechanism for disseminating knowledge and creating bonds. The results of a previous study (see [8]), in which students using our SPE exhibited a significant increase in sense of community, provide additional empirical support for this observation.

6.3 What Is the Impact of Social Participation on Students' Ability to Successfully Complete Their Assignments?
Our quantitative results indicate that social behavior can have a significant influence on future coding behavior. When we examined conversations that contained a suggestion and an acknowledgement by the post's author, we saw the likelihood of the author making progress towards a correct solution increase significantly. Without a suggestion, only 38% of question authors' future compilations demonstrated evidence of progress. With an unacknowledged suggestion, this figure increased to 63%. And this number increased to 85% when the author received and acknowledged a suggestion. This result suggests that students who notice a helpful suggestion are likely to incorporate the suggestion into their own code, and that by doing so, they increase their chances of developing a correct solution.

In addition to helping students make positive solution progress, asking questions and receiving help also led to statistically significant improvement in homework grades, at least for students who asked a question, received a suggestion, and acknowledged the suggestion. This finding suggests that asking for and receiving help on an assignment not only leads students towards a correct solution in the immediate sense; it also increases the likelihood of ultimately receiving a high grade on an assignment.

7. DESIGN IMPLICATIONS
Having addressed our original research questions, we next consider the implications these results might have for future pedagogical environments that aim to support individual programming assignments in early computing courses. Given the positive association between social interaction and programming behavior we identified, we believe that providing a space for students to discuss programming problems should be a key feature of any such pedagogical environment. Moreover, the results presented here provide guidance on how such pedagogical environments should be designed to support social discussions. We consider three specific design implications below.

First, as illustrated by our quantitative analysis, code-centric social interaction becomes more effective when a student receives a suggestion. The impact becomes significant when the question author acknowledges the suggestion. This finding implies that a pedagogical environment should specifically highlight unanswered
questions. For example, a revised version of our SPE might have a section of the activity feed dedicated to unanswered questions.

Second, to ensure that those who ask for help actually read answers to their questions, an effective pedagogical environment needs to somehow bring potential solutions to the attention of question authors. To this end, the environment might contain a notifications area that is updated whenever a new solution has been submitted to a question. Along these lines, it might be helpful to incorporate a mechanism for both marking a potential solution and for marking the "best" solution, similar to what is currently provided by StackOverflow (2012).

Third, as highlighted by Vignettes 3 and 4, students may pose similar questions. Due to the ephemeral nature of an activity feed, students could easily be unaware that their current question has already been asked and answered. This implies the need for students to be made aware of prior questions. Such awareness could be provided through, for example, an ability to search past questions, perhaps using hashtags common on social media sites. Alternatively, the environment could recommend other posts based on the content of the question (e.g. "The following posts may be related to your current question"). Yet another approach could leverage the continuous nature of data logging to relate conversations. For example, for a question related to a compiler error, the pedagogical environment might be able to automatically suggest a discussion that was created by another student under similar coding circumstances.

8. LIMITATIONS
It is important to underscore three limitations of the research presented in this paper. First, while we were able to identify when students incorporated suggestions into their homework, our data does not indicate how students actually used online discussions to solve their programming issues. Understanding the process by which students incorporate feedback into their solutions would require us to augment our log data with retrospective interview or survey data, an important direction for future research.

Second, it is possible that our study suffers from a sampling bias. It is possible that the sample of students selected for this study is not representative of the entire dataset. Alternatively, the inclusion of more A, B, and C students in our sample may have resulted in artificial significance findings. We can address the first threat by increasing the number of students in our sample; however, addressing the second threat would require us to gather more data.

Lastly, while we established inter-rater reliability for the content coding scheme (see Table 2), inter-rater reliability was not established for the other judgments required to perform the data analyses described in Table 1, item 6. We will need to address this shortcoming in future research.

9. CONCLUSION
In this paper, we have empirically investigated the relationship between programming activities and programming-focused activity stream discussions within the context of individual programming assignments in an early computing course. We began with a quantitative analysis that showed that questions posed by students are likely to be related to a current coding issue. We also discovered that students who received and acknowledged a suggestion to their questions were significantly more likely to fix their issues. Furthermore, these students were likely to turn in assignments whose scores were significantly higher than the class average. These findings were elucidated through a series of vignettes that illustrated how programming discussions can influence eventual success, how programming discussions can positively impact other students, and what happens when such discussions do not occur.

The work presented in this paper provides a compelling foundation for future work. First, we would like to broaden our analysis by expanding our selection criteria to include programming questions that occur as a sub-thread of another post. Second, we would like to explore how programming discussions impact students other than the question author. This includes both students who participate in the discussion and students who merely "lurk." Third, we would like to explore factors that contribute to question asking and answering. For example, do students with high social capital receive more responses? Alternatively, are successful students more likely to ask for and receive help? Does this create a divide between students who are capable of clearly formulating programming questions and those who are not? Finally, we would like to explore the impact of the design modifications suggested by the results of this study, in order to further harness the potential for online social interaction to positively influence learning outcomes in early computing courses.

10. ACKNOWLEDGMENTS
This project is funded by the National Science Foundation under grant no. IIS-1321045. Carla DeLira assisted with the coding of the activity stream logs. We are grateful to the participants in the Social Analytics Workshop at ICER 2015 for their suggestions on how to pursue the analyses presented in this paper.

11. REFERENCES
[1] Ahadi, A. et al. 2015. Exploring Machine Learning Methods to Automatically Identify Students in Need of Assistance. Proceedings of the Eleventh Annual International Conference on International Computing Education Research (Omaha, NE, USA, 2015).
[2] Altadmri, A. and Brown, N.C.C. 2015. 37 Million Compilations: Investigating Novice Programming Mistakes in Large-Scale Student Data. Proceedings of the 46th ACM Technical Symposium on Computer Science Education (Kansas City, MO, USA, 2015), 522–527.
[3] Baker, R.S.J. and Siemens, G. 2014. Educational data mining and learning analytics. The Cambridge Handbook of the Learning Sciences. Cambridge University Press. 253–274.
[4] Bandura, A. 1997. Self-efficacy: the exercise of control. Worth Publishers.
[5] Bergin, S. et al. 2005. Examining the role of self-regulated learning on introductory programming performance. Proc. 2005 ACM International Computing Education Research Workshop. ACM Press. 81–86.
[6] Bransford, J. et al. eds. 1999. How people learn: Brain, mind, experience, and school. National Academy Press.
[7] Carter, A.S. et al. 2015. The Normalized Programming State Model: Predicting Student Performance in Computing Courses Based on Programming Behavior. Proceedings of the Eleventh Annual Conference on International Computing Education Research (2015).
[8] Carter, A.S. and Hundhausen, C.D. 2015. The Design of a Programming Environment to Support Greater Social Awareness and Participation in Early Computing Courses. J. Comput. Sci. Coll. 31, 1 (Oct. 2015), 143–153.
[9] Denny, P. et al. 2011. PeerWise: Exploring Conflicting Efficacy Studies. Proceedings of the Seventh International Workshop on Computing Education Research (Providence, Rhode Island, USA, 2011).
[10] Hartmann, B. et al. 2010. What would other programmers do: suggesting solutions to error messages. Proc. 28th Conference on Human Factors in Computing Systems. ACM. 1019–1028.
[11] Hundhausen, C.D. et al. 2011. Online vs. face-to-face pedagogical code reviews: An empirical comparison. Proceedings 2011 SIGCSE Symposium on Computer Science Education. ACM Press. 117–122.
[12] Hundhausen, C.D. et al. 2015. Supporting programming assignments with activity streams: an empirical study. Proceedings of the 46th ACM Technical Symposium on Computer Science Education (New York, 2015), 320–325.
[13] Hundhausen, C.D. and Carter, A.S. 2014. Facebook me about your code: An empirical study of the use of activity streams in early computing courses. Journal of Computing Sciences in Colleges. 30, 1 (2014), 151–160.
[14] Jadud, M.C. 2006. Methods and tools for exploring novice compilation behaviour. Proc. Second International Workshop on Computing Education Research. ACM. 73–84.
[15] Jadud, M.C. and Dorn, B. 2015. Aggregate Compilation Behavior: Findings and Implications from 27,698 Users. Proceedings of the Eleventh Annual International Conference on International Computing Education Research (Omaha, NE, USA, 2015).
[16] Jeske, D. et al. 2014. Learner characteristics predict performance and confidence in e-Learning: An analysis of user behavior and self-evaluation. Journal of Interactive Learning Research. 25, 4 (2014), 509–529.
[17] Lave, J. and Wenger, E. 1991. Situated Learning: Legitimate Peripheral Participation. Cambridge University Press.
[18] Luxton-Reilly, A. et al. 2012. The Impact of Question Generation Activities on Performance. Proceedings of the 43rd ACM Technical Symposium on Computer Science Education (Raleigh, NC, USA, 2012).
[19] Ma, W. et al. 2014. Intelligent tutoring systems and learning outcomes: A meta-analytic survey. Journal of Educational Psychology. 106, 2007 (2014), 901–918.
[20] Mujumdar, D. et al. 2011. Crowdsourcing suggestions to programming problems for dynamic web development languages. CHI ’11 Extended Abstracts on Human Factors in Computing Systems (Vancouver, BC, Canada, 2011), 1525–1530.
[21] Pifarre, M. and Cobos, R. 2010. Promoting metacognitive skills through peer scaffolding in a CSCL environment. Computer-Supported Collaborative Learning. 5, (2010).
[22] Rosson, M.B. et al. 2011. Orientation of Undergraduates Toward Careers in the Computer and Information Sciences: Gender, Self-Efficacy and Social Support. ACM Transactions on Computing Education. 11, 3 (Oct. 2011), 1–23.
[23] Schunk, D.H. 2012. Learning theories: An educational perspective. Merrill Prentice Hall.
[24] Slavin, R.E. 2011. Educational psychology: Theory and practice. Pearson Education.
[25] U.S. Department of Education, Office of Educational Technology 2012. Enhancing Teaching and Learning through Educational Data Mining and Learning Analytics: An Issue Brief.
[26] Verbert, K. and Duval, E. 2012. Learning Analytics. Learning and Education. 1, 8 (2012).
[27] Watson, C. et al. 2014. No tests required: comparing traditional and dynamic predictors of programming success. Proceedings of the 45th ACM Technical Symposium on Computer Science Education (2014), 469–474.
[28] Watson, C. et al. 2013. Predicting Performance in an Introductory Programming Course by Logging and Analyzing Student Programming Behavior. Proceedings of the 2013 IEEE 13th International Conference on Advanced Learning Technologies (2013), 319–323.
[29] Wise, A.F. 2014. Designing Pedagogical Interventions to Support Student Use of Learning Analytics. Proceedings of the Fourth International Conference on Learning Analytics And Knowledge (New York, NY, USA, 2014), 203–211.
