Table of Contents
- Style
- Printing Variables
- Basic Print Statements
- Standard Error Stream
- Debug Template
- Getting the Line Number
- Checking for OOB
- Unspecified Evaluation Order
- Stress Testing
- Assertions & Warnings
- GCC Compilation Options
- Warning Options
- -Wshadow
- Other Options
- -fsanitize=undefined
- -fsanitize=address -g
- -D_GLIBCXX_DEBUG
- Debuggers
- Debugging (Language-Specific)
Authors: Benjamin Qi, Aaron Chew, Aryansh Shrivastava, Owen Wang
Identifying errors within your program and how to avoid them in the first place.
Resources | ||||
---|---|---|---|---|
AryanshS | Some parts were taken from here. | |||
LCPP | How to add print statements. |
Style
Resources | ||||
---|---|---|---|---|
CF | Contains many important gems. |
Printing Variables
Basic Print Statements
The most basic way to debug is to add a print statement, which is often all you need. For instance, we can write the code below to check the value of x at some point in our program.
C++
#include <iostream>
using namespace std;

int x = 10;  // pretend this holds some important variable

void dbg() { cout << "x is " << x << endl; }

int main() {
	dbg();  // outputs "x is 10"
	x = 5000;
	dbg();  // now outputs "x is 5000"
}
Java
public class Main {
	static int x = 10;  // pretend this holds some important variable

	public static void main(String[] args) {
		dbg();  // outputs 10
		x = 5000;
		dbg();  // now outputs 5000
	}

	static void dbg() { System.out.println(x); }
}
Python
x = 10  # pretend this holds some important variable


def dbg():
	print(x)


dbg()  # outputs 10
x = 5000
dbg()  # now outputs 5000
Such print statements are great on a basic level, and we can comment or define them out of our main code when we need to compile and execute a more final version of our code.
However, as useful as print statements are, it is tedious to keep them separate from the real output of our code. This matters, for example, when we want an online judge (OJ) to read our output.
Standard Error Stream
The standard error stream is a quick fix to this. Instead of printing in the standard output stream, we can print in a whole new stream called the standard error stream instead.
C++
#include <iostream>
using namespace std;

int x = 10;

void dbg() { cerr << "x is " << x << endl; }

int main() {
	dbg();
	x = 5000;
	dbg();
}
Java
public class Main {
	static int x = 10;

	public static void main(String[] args) {
		dbg();
		x = 5000;
		dbg();
	}

	static void dbg() { System.err.println(x); }
}
Python
import sys

x = 10


def dbg():
	print(x, file=sys.stderr)


dbg()  # outputs 10
x = 5000
dbg()  # now outputs 5000
If you run this program in a terminal, the content of the error stream appears right alongside that of the standard output stream, so you might not notice a difference at first. The benefit is that the two streams are separate: if we submit this program to an OJ, the judge only reads standard output and won't notice the output in the error stream at all!
Warning!
Printing too much content (even to the error stream) can cause TLE when submitting to an OJ.
C++
Debug Template
As C++ does not contain built-in print functions for many of its built-in data structures, it would be good to have some prewritten code to print them. This template is rather easy to use. It includes support for basically all of the needed data structures in competitive programming. Here's how you would use it:
#include <iostream>
#include <vector>
#include "debugging.h"

using namespace std;

int main() {
	vector<int> arr{1, 2, 3, 4};
	cout << arr << endl;  // just feed it into cout like any other variable
}
Warning!
You are not allowed to use pre-written code for USACO contests, so this template should only be used for other online contests.
Getting the Line Number
Sometimes you'd like to know roughly which line your code fails on.
To print the line number, you can use the __LINE__ macro like so:
#include <iostream>
using namespace std;

int main() {
	cout << __LINE__ << endl;  // outputs 5, the current line number
}
Checking for OOB
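Accessing or writing to a vector at an out-of-bounds index with operator[] is undefined behavior in C++: the program usually either fails silently (reading or corrupting unrelated memory) or segfaults. Writing past the end of a buffer like this is called a buffer overflow.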
For example, the following code doesn't behave as expected:
#include <bits/stdc++.h>
using namespace std;

int main() {
	vector<int> invalid_vec{1};
	vector<int> valid_vec{1234};
	cout << valid_vec[0] << "\n";  // outputs 1234
	for (int i = 0; i < 10; i++) {
		invalid_vec[i] = i;  // out of bounds for i >= 1
	}
	cout << valid_vec[0] << "\n";  // may print a corrupted value or crash
}
To prevent this, you can use vector::at instead of vector::operator[].
If we rewrite the previous code segment to use at() like so:
#include <bits/stdc++.h>
using namespace std;

int main() {
	vector<int> invalid_vec{1};
	vector<int> valid_vec{1234};
	cout << valid_vec.at(0) << "\n";  // outputs 1234
	for (int i = 0; i < 10; i++) {
		invalid_vec.at(i) = i;  // throws std::out_of_range when i = 1
	}
	cout << valid_vec.at(0) << "\n";  // never reached
}
C++ will now check the bounds when we access the vectors and will produce the following output:
1234 terminate called after throwing an instance of 'std::out_of_range' what(): vector::_M_range_check: __n (which is 1) >= this->size() (which is 1) 1 zsh: abort ./$1 $@[2,-1]
If you want to find out the exact line at which this error occurs, you can use a debugger such as gdb or lldb.
Unspecified Evaluation Order
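Consider the following code stored in bad.cpp:
#include <bits/stdc++.h>
using namespace std;

vector<int> res{-1};

int add_element() {
	res.push_back(-1);
	return res.size() - 1;
}

int main() {
	for (int i = 0; i < 5; ++i) {
		res[i] = add_element();
		cout << i << " " << res[i] << "\n";
	}
}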
Compiling and running the above code with C++17 like so:
g++ -std=c++17 bad.cpp -o bad && ./bad
gives the intended output:
0 1
1 2
2 3
3 4
4 5
But compiling and running with C++14 like this:
g++ -std=c++14 bad.cpp -o bad && ./bad
gives:
0 -1
1 -1
2 3
3 -1
4 5
However, the code works correctly if you save the result of add_element() to an intermediate variable.
int main() {
	for (int i = 0; i < 10; ++i) {
		int tmp = add_element();
		res[i] = tmp;
		cout << i << " " << res[i] << "\n";
	}
}
The problem is that res[i] = add_element(); only works if add_element() is evaluated before res[i] is. If res[i] is evaluated first, and add_element() then causes the memory for res to be reallocated, the reference obtained from res[i] is invalidated. Before C++17, the order in which res[i] and add_element() are evaluated is unspecified; since C++17, the right-hand side of an assignment is evaluated before the left-hand side.
See this StackOverflow post for some discussion about why this is the case (here's a similar issue).
You also may come across this issue when trying to create a trie.
Stress Testing
If your code is getting WA, one option is to run your buggy code against another solution that you're relatively confident is correct on randomly generated data until you find a difference. See the video for details.
Resources | ||||
---|---|---|---|---|
Errichto | Using a script for stress testing. | |||
Errichto | Contains some parts from the above videos. | |||
Benq | The script from the above video. |
C++
Here is the script that was mentioned in the video:
# A and B are executables you want to compare, gen takes int
# as command line arg. Usage: 'bash stress.sh'
# (the for ((...)) loop is bash syntax and may not work in other shells)
for ((i = 1; ; ++i)); do  # if they are same then will loop forever
	echo $i
	./gen $i > int
	./A < int > out1
	./B < int > out2
	diff -w out1 out2 || break
	# diff -w <(./A < int) <(./B < int) || break
done
We can modify this to work for other situations. For example, if you have input and output files (ex. 1.in, 1.out, 2.in, 2.out, ..., 10.out for old USACO problems) then you can use the following:
# A is the executable you want to test
for ((i = 1; i <= 10; ++i)); do
	echo $i
	./A < $i.in > out
	diff -w out $i.out || break
done
echo "ALL TESTS PASSED"
The following will break on the first input file such that the produced output file is empty.
for ((i = 1; ; ++i)); do
	echo $i
	./gen $i > int
	./A < int > out
	if ! [[ -s "out" ]]; then
		echo "no output"
		break
	fi
done
Warning!
This won't work if you're using Windows. Instead, you can use what tourist does:
:: save this in test.bat
@echo off
gen > in
your_sol < in > out
correct_sol < in > correct_out
fc out correct_out
if errorlevel 1 exit
test
Java
Here is a script to test a Java program with input and output files. You will need to put the .java file, this script, and the input and output files (1.in, 1.out, etc.) in the same directory:
Java testing script
If you want to learn how to write these scripts yourself, you can check here.
C++
Assertions & Warnings
Resources | ||||
---|---|---|---|---|
LCpp | Includes | |||
GCC | Talks about |
GCC Compilation Options
Resources | ||||
---|---|---|---|---|
CF | Includes all the options mentioned below. |
You can also check what options Errichto and ecnerwala use.
Warning Options
In this section we'll go over some extra compilation options you can pass to g++ to aid in debugging. You can find the official documentation for said options here.
Some other options that you might find helpful (but we won't go over) are the following:
-Wall -Wextra -Wshadow -Wconversion -Wfloat-equal -Wduplicated-cond -Wlogical-op
-Wshadow
Resources | ||||
---|---|---|---|---|
LCPP |
Avoid variable shadowing!
Other Options
Let's give some examples of what each of these does.
Warning!
These can slow down compilation and even runtime, so don't enable them when speed is of the essence (ex. for Facebook Hacker Cup).
Warning!
-fsanitize flags don't work with MinGW. If you're using Windows but still want to use these flags, consider using an online compiler (or installing Linux) instead.
-fsanitize=undefined
The following code stored in prog.cpp gives a segmentation fault.
#include <bits/stdc++.h>
using namespace std;

int main() {
	vector<int> v;
	cout << v[-1] << endl;
}
g++ prog.cpp -o prog -fsanitize=undefined && ./prog
produces:
/usr/local/Cellar/gcc/9.2.0_1/include/c++/9.2.0/bits/stl_vector.h:1043:34: runtime error: pointer index expression with base 0x000000000000 overflowed to 0xfffffffffffffffc
zsh: segmentation fault ./prog
Another example with prog.cpp is the following:
#include <bits/stdc++.h>
using namespace std;

int main() {
	int v[5];
	cout << v[5] << endl;
}
g++ prog.cpp -o prog -fsanitize=undefined && ./prog
produces:
prog.cpp:6:13: runtime error: index 5 out of bounds for type 'int [5]'
prog.cpp:6:13: runtime error: load of address 0x7ffee0a77a94 with insufficient space for an object of type 'int'
0x7ffee0a77a94: note: pointer points here
  b0 7a a7 e0 fe 7f 00 00 25 b0 a5 0f 01 00 00 00 b0 7a a7 e0 fe 7f 00 00 c9 8c 20 72 ff 7f 00 00
              ^
-fsanitize=undefined also catches integer overflow. Let prog.cpp be the following:
#include <bits/stdc++.h>
using namespace std;

int main() {
	int x = 1 << 30;
	cout << x + x << endl;
}
g++ prog.cpp -o prog -fsanitize=undefined && ./prog
produces:
prog.cpp:6:15: runtime error: signed integer overflow: 1073741824 * 2 cannot be represented in type 'int'
We can also combine -fsanitize=undefined with -fno-sanitize-recover. Error recovery for -fsanitize=undefined is turned on by default, so the program keeps running after each reported error. The -fno-sanitize-recover= option can be used to alter this behavior: only the first detected error is reported and the program then exits with a non-zero exit code.
So if prog.cpp is as follows:
#include <bits/stdc++.h>
using namespace std;

int main() {
	cout << (1 << 32) << endl;
	cout << (1 << 32) << endl;
	cout << (1 << 32) << endl;
}
then
g++ -fsanitize=undefined prog.cpp -o prog && ./prog
produces:
prog.cpp: In function 'int main()':
prog.cpp:5:12: warning: left shift count >= width of type [-Wshift-count-overflow]
    5 |  cout << (1 << 32) << endl;
      |           ~^~~~
prog.cpp:6:12: warning: left shift count >= width of type [-Wshift-count-overflow]
    6 |  cout << (1 << 32) << endl;
      |           ~^~~~
prog.cpp:7:12: warning: left shift count >= width of type [-Wshift-count-overflow]
    7 |  cout << (1 << 32) << endl;
      |           ~^~~~
prog.cpp:5:12: runtime error: shift exponent 32 is too large for 32-bit type 'int'
0
prog.cpp:6:12: runtime error: shift exponent 32 is too large for 32-bit type 'int'
0
prog.cpp:7:12: runtime error: shift exponent 32 is too large for 32-bit type 'int'
0
while
g++ -fsanitize=undefined -fno-sanitize-recover prog.cpp -o prog && ./prog
produces:
prog.cpp: In function 'int main()':
prog.cpp:5:12: warning: left shift count >= width of type [-Wshift-count-overflow]
    5 |  cout << (1 << 32) << endl;
      |           ~^~~~
prog.cpp:6:12: warning: left shift count >= width of type [-Wshift-count-overflow]
    6 |  cout << (1 << 32) << endl;
      |           ~^~~~
prog.cpp:7:12: warning: left shift count >= width of type [-Wshift-count-overflow]
    7 |  cout << (1 << 32) << endl;
      |           ~^~~~
prog.cpp:5:12: runtime error: shift exponent 32 is too large for 32-bit type 'int'
zsh: abort ./prog
-fsanitize=address -g
Warning!
According to this issue, AddressSanitizer does not appear to be available for MinGW.
Resources | ||||
---|---|---|---|---|
GCC | documentation for -g, -ggdb |
The following code (stored in prog.cpp) gives a segmentation fault.
#include <bits/stdc++.h>
using namespace std;

int main() {
	vector<int> v;
	cout << v[-1] << endl;
}
g++ prog.cpp -o prog -fsanitize=address && ./prog
produces:
AddressSanitizer
For more helpful information we should additionally compile with the -g flag, which generates debugging information that maps the compiled program back to the lines of the source (on some platforms, such as macOS, this is stored in a separate file). -fsanitize=address can then use this information at runtime to give meaningful errors. This is great because it helps diagnose (or "sanitize" if you will) errors such as out-of-bounds accesses and segmentation faults, even indicating precise line numbers. If a separate debug file was generated, feel free to delete it after the run.
AddressSanitizer with -g
Another example is with prog.cpp as the following:
#include <bits/stdc++.h>
using namespace std;

int main() {
	int v[5];
	cout << v[5] << endl;
}
g++ prog.cpp -o prog -fsanitize=address -g && ./prog
produces:
AddressSanitizer with -g
-D_GLIBCXX_DEBUG
Resources | ||||
---|---|---|---|---|
GCC | documentation for -D_GLIBCXX_DEBUG |
The following prog.cpp gives a segmentation fault.
#include <bits/stdc++.h>
using namespace std;

int main() {
	vector<int> v;
	cout << v[-1] << endl;
}
g++ prog.cpp -o prog -D_GLIBCXX_DEBUG && ./prog
produces:
Debug
Debuggers
C++
Resources | ||||
---|---|---|---|---|
LCPP | ||||
Microsoft | ||||
Jetbrains |
Java
Resources | ||||
---|---|---|---|---|
Microsoft | ||||
Jetbrains |
Python
Resources | ||||
---|---|---|---|---|
Microsoft | ||||
Jetbrains |
Using a debugger varies from language to language and even from IDE to IDE, so we will only go over the basics.
A debugger allows you to pause your code during execution and inspect the values of its variables at that point.
To do this, set a "breakpoint" at a certain line of code. When the program reaches that breakpoint, it will pause and you will be able to inspect all the variables at that instant.
There are two more useful and common operations. Once you are at a breakpoint, you may want to see what happens after the current line is executed. The "Step Over" button lets you move to the next line. Say you are at a line with the following code: dfs(0, -1). If you click "Step Over", the debugger will run the function without showing you what happens inside it and go to the next line. If you click "Step In", however, you will enter the function and be able to step through it.
In essence, a debugger is a tool to "trace code" for you. It is not much different from just printing the values out at various points in your program.
Pros of using a debugger:
- No need to write print statements so you save time
- You can step through the code in real time
Cons of using a debugger:
- You cannot see the overall "output" of your program at each stage. For example, if you wanted to see every single value of i in the program, you could not do so using a debugger.
- Most advanced competitive programmers do not use debuggers; it is usually not very efficient to use one during a contest.