This document describes some caveats about the use of Valgrind with
Python.  Valgrind is used periodically by Python developers to try
to ensure there are no memory leaks or invalid memory reads/writes.

If you want to enable Valgrind support in Python, you will need to
configure Python with the --with-valgrind option or the older
--without-pymalloc option.

UPDATE: Python 3.6 now supports the PYTHONMALLOC=malloc environment
variable, which can be used to force the usage of the malloc() allocator
of the C library.

If you don't want to read about the details of using Valgrind, there
are still two things you must do to suppress the warnings.  First,
you must use a suppressions file.  One is supplied in
Misc/valgrind-python.supp.  Second, you must uncomment the lines in
Misc/valgrind-python.supp that suppress the warnings for PyObject_Free and
PyObject_Realloc.

If you want to use Valgrind more effectively and catch even more
memory leaks, you will need to configure Python --without-pymalloc.
PyMalloc allocates a few blocks in big chunks, and most object
allocations don't call malloc; they use chunks doled out by PyMalloc
from the big blocks.  This means Valgrind can't detect
many allocations (and frees), except for those that are forwarded
to the system malloc.
Note: configuring Python --without-pymalloc
makes Python run much slower, especially when running under Valgrind.
You may need to run the tests in batches under Valgrind to keep
the memory usage down to allow the tests to complete.  It seems to take
about 5 times longer to run --without-pymalloc.

Apr 15, 2006:
  test_ctypes causes Valgrind 3.1.1 to fail (crash).
  test_socket_ssl should be skipped when running Valgrind.
        The reason is that it purposely uses uninitialized memory.
        This causes many spurious warnings, so it's easier to just skip it.


Details:
--------
Python uses its own small-object allocation scheme on top of malloc,
called PyMalloc.

Valgrind may show some unexpected results when PyMalloc is used.
Starting with Python 2.3, PyMalloc is used by default.  You can disable
PyMalloc when configuring Python by adding the --without-pymalloc option.
If you disable PyMalloc, most of the information in this document and
the supplied suppressions file will not be useful.  As discussed above,
disabling PyMalloc can catch more problems.

PyMalloc uses 256KB chunks of memory, so Valgrind can't detect anything
wrong within these blocks.  For that reason, compiling Python
--without-pymalloc usually increases the usefulness of other tools.

If you use Valgrind on a default build of Python, you will see
many errors like:

        ==6399== Use of uninitialised value of size 4
        ==6399== at 0x4A9BDE7E: PyObject_Free (obmalloc.c:711)
        ==6399== by 0x4A9B8198: dictresize (dictobject.c:477)

These are expected and not a problem.  Tim Peters explains
the situation:

        PyMalloc needs to know whether an arbitrary address is one
        that's managed by it, or is managed by the system malloc.
        The current scheme allows this to be determined in constant
        time, regardless of how many memory areas are under pymalloc's
        control.

        The memory pymalloc manages itself is in one or more "arenas",
        each a large contiguous memory area obtained from malloc.
        The base address of each arena is saved by pymalloc
        in a vector.  Each arena is carved into "pools", and a field at
        the start of each pool contains the index of that pool's arena's
        base address in that vector.

        Given an arbitrary address, pymalloc computes the pool base
        address corresponding to it, then looks at "the index" stored
        near there.  If the index read up is out of bounds for the
        vector of arena base addresses pymalloc maintains, then
        pymalloc knows for certain that this address is not under
        pymalloc's control.
        Otherwise the index is in bounds, and
        pymalloc compares

            the arena base address stored at that index in the vector

        to

            the arbitrary address pymalloc is investigating

        pymalloc controls this arbitrary address if and only if it lies
        in the arena the address's pool's index claims it lies in.

        It doesn't matter whether the memory pymalloc reads up ("the
        index") is initialized.  If it's not initialized, then
        whatever trash gets read up will lead pymalloc to conclude
        (correctly) that the address isn't controlled by it, either
        because the index is out of bounds, or the index is in bounds
        but the arena it represents doesn't contain the address.

        This determination has to be made on every call to one of
        pymalloc's free/realloc entry points, so its speed is critical
        (Python allocates and frees dynamic memory at a ferocious rate
        -- everything in Python, from integers to "stack frames",
        lives in the heap).