From Browser To Conscience:

Internalizing Useful Attitudes Towards Software Correctness Using Distance Learning Technology

David Arnow and Gerald Weiss

Department of Computer and Information Sciences

Brooklyn College of CUNY

2900 Bedford Avenue

Brooklyn, NY 11210

Phone: (718) 951-5657

Fax: (718) 951-4406

{arnow,weiss}@sci.brooklyn.cuny.edu

Submission Type: Workshop Position Statement

Workshop Title: Establishing a Distance Education Program

Workshop Organizers: Helen M. Edwards and J. Barrie Thompson

Attitude toward software correctness

An appropriate attitude towards software correctness is a sine qua non for programmers and software engineers. Such an attitude is reflected in one’s approach to two important elements of the software development process:

Reading of specifications – should be critical and analytical

Testing – should be methodical and thorough, almost bordering on paranoia

Network-based systems for submitting and automatically checking programming assignments are typically found in distance learning environments. These systems can be extremely useful because their characteristics, including their impersonal nature, force the student to internalize the proper attitude towards software correctness.

Where do students start from?

An appropriate attitude towards software correctness is acquired, not innate. Students commencing a study of computer science display a broad range of naive views of correctness:

"If it compiles without errors, it’s correct."

"If it runs without crashing it’s correct."

"If it produces output it’s correct."

"If it produces correct output for a single case, it’s correct."

"If I do the easy cases, the instructor will never notice that the more difficult ones fail."

These may be humorous, but the joke ends when these attitudes persist, in one form or another, in the workplace. Professional programmers know the meaning of the "dusty corners" of code where one is hesitant to stray.

How do we correct this?

Naturally, faculty attempt to counter these attitudes by advocating more realistic ones. However, the most effective learning is through experience. This is nowhere more true than in the case of attitudes. We would go so far as to say the only way to learn, that is, to internalize, an attitude is through experience. The question is, what are the experiences that will have the greatest impact?

We believe that the proper use of an automatic programming assignment checker can provide much of the necessary experience, in a way that traditional, manual grading cannot. In the first place, it is much easier to detect and reject incorrect homework submissions with an automatic system – this in itself increases the importance of program correctness in the student’s consciousness. Careful setup of a checking system makes it possible to require that students pay attention to every aspect of the problem specification.

Secondly, a problem with manual grading is that, lacking the ability to test the program directly, instructors may insist on an "adequate" set of tests provided by the student. Although it is important for students to learn to develop their own test suites, this arrangement can lead students to feel that the tests are an end in themselves – the way to satisfy the instructor is to provide a large number of tests. (Of course, the instructor is not in a position to evaluate more than a minimal number of tests in any case.) The point of testing – program correctness – is lost on the student. When an automated checking system simply rejects programs that fail any of its tests, the goal of program correctness remains primary, and testing is seen as a means to an end.

On the other hand, the instructor in a manual grading system may elect to supply a test suite to the students rather than having them construct their own. The problem here is that the students do not learn how to test and, worse, this approach undermines the notion of programming to specification – the students program to the test suite.

Finally, manual grading inevitably examines many other issues – style, structure, design, documentation – many of which have a distinctly subjective component. Throwing correctness into the list gives the erroneous impression that correctness too has a subjective component. Worse, deducting 10% from an assignment on account of a boundary condition error sends students precisely the wrong message concerning correctness.

The true value of the automatic program checker is that it simulates the action of the student’s program in a real-life situation. Programs are not correct simply because they pass an instructor’s test suite; they must also interface properly with other software. Although the automatic program checker is in fact nothing more than an execution of the instructor’s test suite, it appears to the student as "an external real world context". Furthermore, the checker, acting as a test driver, adds another dimension to the student’s program conforming to specification. It is no longer enough to write a function that provides the specified behavior; it must also do so with the appropriate interface. This is hard to achieve using a manually graded system.
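The interface requirement described above can be illustrated with a minimal sketch in Python. Everything here is an assumption made for illustration: the helper name `conforms_to_interface`, the example specification (`mean(values)`), and the choice to require exact parameter names. It does not describe how any particular checker is implemented.

```python
import inspect

def conforms_to_interface(namespace, name, param_names):
    """Return True if `namespace` defines a callable called `name` whose
    parameter names match `param_names` exactly, in order.
    (Hypothetical helper; the strict name match is an assumption.)"""
    func = namespace.get(name)
    if not callable(func):
        return False
    try:
        actual = list(inspect.signature(func).parameters)
    except (TypeError, ValueError):
        return False
    return actual == list(param_names)

# Hypothetical specification: "write mean(values), returning the average".
good_source = "def mean(values):\n    return sum(values) / len(values)\n"
bad_source = "def average(values):\n    return sum(values) / len(values)\n"

good_ns, bad_ns = {}, {}
exec(good_source, good_ns)  # simulate loading a student's submission
exec(bad_source, bad_ns)

print(conforms_to_interface(good_ns, "mean", ["values"]))  # True
print(conforms_to_interface(bad_ns, "mean", ["values"]))   # False
```

The second submission computes the right value but under the wrong name, so a test driver that calls `mean` could never reach it. This is the sense in which behavior alone is not enough.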

WebToTeach

WebToTeach (http://wtt.sci.brooklyn.cuny.edu/) is a web-based, automatic homework-checking tool for computer science classes. Students using WebToTeach can get lists of programming exercises from their instructor. Each exercise comes with a set of instructions and an HTML form for submitting an answer. The answer may consist of one or more complete source or data "files", or just a fragment of code. Within a few seconds of submitting the form, the student gets a response. If the student’s answer passes all the WebToTeach tests, the answer is accepted. Otherwise, the answer is rejected.
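The accept-or-reject behavior described above can be sketched in a few lines of Python. This is an illustration only, not WebToTeach's actual implementation: the function `check_submission`, the test-case format, and the `max3` exercise are all hypothetical.

```python
def check_submission(func, tests):
    """All-or-nothing check: accept only if every test passes.
    `tests` is a list of (arguments, expected_result) pairs.
    (Hypothetical format, assumed for this sketch.)"""
    for args, expected in tests:
        try:
            if func(*args) != expected:
                return "rejected"
        except Exception:
            # A crash counts as a failure, not as partial credit.
            return "rejected"
    return "accepted"

# Hypothetical exercise: return the largest of three integers.
tests = [((1, 2, 3), 3), ((3, 2, 1), 3), ((5, 5, 1), 5)]

def student_max3(a, b, c):
    # Boundary error: ties between a and b fall through to c.
    if a > b and a > c:
        return a
    if b > a and b > c:
        return b
    return c

def fixed_max3(a, b, c):
    if a >= b and a >= c:
        return a
    if b >= c:
        return b
    return c

print(check_submission(student_max3, tests))  # rejected
print(check_submission(fixed_max3, tests))    # accepted
```

The first version passes two of the three tests, yet it is rejected outright; there is no score to negotiate, only a specification to meet.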

This facility can be used in a variety of ways: drill, programming project, and test suite development. In the last case, the student is given a specification of an existing piece of code and is asked to provide an appropriate set of test cases. These test cases are then run against several versions of the code, only one of which is correct. The others suffer from various "classical" logic errors: typically boundary conditions and algorithm flaws. A correct submission is one that causes the valid version to succeed while exposing errors in each of the invalid versions.
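The test suite development exercise can be sketched as follows. This is a minimal Python illustration under stated assumptions: the names `passes` and `judge_suite`, the test-case format, and the `in_range` specification with its three faulty versions are all hypothetical and are not taken from WebToTeach.

```python
def passes(func, suite):
    """True if `func` returns the expected result for every case in `suite`."""
    for args, expected in suite:
        try:
            if func(*args) != expected:
                return False
        except Exception:
            return False
    return True

def judge_suite(suite, correct_version, faulty_versions):
    """Accept a student's test suite only if the correct version passes it
    and every faulty version fails at least one of its cases."""
    if not passes(correct_version, suite):
        return "rejected"  # the suite contradicts the specification
    for faulty in faulty_versions:
        if passes(faulty, suite):
            return "rejected"  # this error went undetected
    return "accepted"

# Hypothetical specification: in_range(x, lo, hi) is True iff lo <= x <= hi.
def correct(x, lo, hi):
    return lo <= x <= hi

faulty_versions = [
    lambda x, lo, hi: lo < x <= hi,  # lower boundary excluded
    lambda x, lo, hi: lo <= x < hi,  # upper boundary excluded
    lambda x, lo, hi: x >= lo,       # upper bound never checked
]

weak_suite = [((5, 1, 10), True), ((0, 1, 10), False)]
thorough_suite = weak_suite + [
    ((1, 1, 10), True), ((10, 1, 10), True), ((11, 1, 10), False),
]

print(judge_suite(weak_suite, correct, faulty_versions))      # rejected
print(judge_suite(thorough_suite, correct, faulty_versions))  # accepted
```

The weak suite is entirely correct as far as it goes, but it never probes the boundaries, so the first faulty version slips through and the submission is rejected.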

Internalization

Why is it internalized? If there’s no one there to do it for you, you have to do it yourself.

Results/Experiences

Judging from the students’ reactions to the system – anger, frustration, resentment (all the emotions that make learning a positive experience) and eventual acceptance and even pride – we are confident that the experience is quite different from that of manual grading. Furthermore, we detect a marked change over the course of the semester in the way students view problem specification. It is not unusual, after a few assignments, to have students:

ask questions about boundary conditions

ask whether handling the (erroneous) case of alphabetic information in a numeric field is part of the specifications of an assignment

detect ambiguities in the specifications (intended by the instructor or otherwise!)

The students have gone beyond a critical analysis of the specification to an almost paranoid approach. Their view is that almost anything that might happen will happen.

We are not in a position to report decisively on long-term attitude changes concerning software correctness, but – anecdotally – many students report that interacting with this system was the single most important experience in their career here; they often cite the lessons concerning specification and testing as having specific relevance to their professional careers.

Position

Impersonal automated program testing – a near inevitability in the context of distance learning – is in fact an essential tool for all software training and education.