We recently ran the Making Software Greener code analyzer across nine widely used Python libraries: boto3, django, fastapi, flask, jinja, numpy, pandas, pytorch, and requests. The goal was not to single out these projects, but to learn from real-world code that millions of developers rely on every day. By looking at what shows up most often, we can better understand which mistakes are easy to make—and where small improvements could lead to cleaner, faster, and more sustainable software.
What We Found
The analyzer flagged a mix of performance inefficiencies and general code issues. Here are the top results, ranked by how often they appeared:
Issue | Count |
---|---|
Using print(...) in library code (prefer logging) | 3256 |
String concatenation in a loop (use list + ''.join(...) or io.StringIO) | 2524 |
Broad except Exception (narrow to specific exceptions) | 1273 |
df.append(...) inside a loop (accumulate rows then use pd.concat) | 569 |
Membership test on a sequence in a loop (prefer set for O(1)) | 542 |
subprocess.run(...) without a timeout | 165 |
subprocess.check_output(...) without a timeout | 160 |
open(...) not used as a context manager (use with open(...) as f:) | 133 |
Unnecessary list materialization for max(...) (prefer a generator) | 127 |
Unnecessary list materialization for sum(...) (prefer a generator) | 104 |
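To make the membership-test and list-materialization rows concrete, here is a minimal sketch. The data and names (allowed_ids, events) are hypothetical and chosen only for illustration; they do not come from any of the analyzed libraries.

```python
# Hypothetical data, used only to illustrate the patterns.
allowed_ids = [3, 7, 11, 42]
events = [
    {"id": 7, "size": 10},
    {"id": 5, "size": 4},
    {"id": 42, "size": 6},
]

# Converting the sequence to a set once gives average O(1) membership tests;
# testing against the list would scan it linearly on every iteration.
allowed = set(allowed_ids)
matching = [e for e in events if e["id"] in allowed]

# Generator expressions feed max() and sum() one item at a time,
# so no intermediate list is built just to be thrown away.
largest = max(e["size"] for e in matching)
total = sum(e["size"] for e in matching)
```

For inputs this small the difference is negligible; the pattern matters when the sequence is large or the loop runs many times.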
Why It Matters
These issues aren’t catastrophic bugs. Most of the time, the code still works as intended. But they do add up:
- Performance: String concatenation in loops or appending to a DataFrame row by row can quietly slow things down. In large-scale systems, that wasted time can also mean wasted energy.
- Reliability: Broad exception handling can hide unexpected failures, which makes problems harder to trace and errors harder to handle safely.
- Maintainability: Using print(...) instead of logging makes it harder to control output, filter messages, or integrate with monitoring tools.
- Resource use: Omitting timeouts in subprocess calls risks hanging processes, which can lock up resources longer than necessary.
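The first three points can be sketched in a few lines. This is an illustrative example under stated assumptions, not code from any of the analyzed libraries: render_report and parse_port are hypothetical helpers, and the default port is arbitrary.

```python
import logging

# Library code usually takes a module-level logger and leaves handler
# configuration to the application; NullHandler keeps it silent by default.
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())


def render_report(rows):
    """Build a report by collecting parts and joining them once.

    Hypothetical helper: avoids repeated `text += ...` inside the loop.
    """
    parts = []
    for name, value in rows:
        parts.append(f"{name}={value}")
    return "\n".join(parts)


def parse_port(raw, default=8080):
    """Parse a port number, catching only the errors int() is expected to raise.

    Hypothetical helper: the default value is an arbitrary assumption.
    """
    try:
        return int(raw)
    except (TypeError, ValueError):
        # Logged instead of printed, so the application decides where it goes.
        logger.warning("Invalid port %r, falling back to %d", raw, default)
        return default
```

Catching only TypeError and ValueError means an unrelated failure, such as a KeyboardInterrupt or a bug elsewhere in the call, still surfaces instead of being silently swallowed.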
Learning from Patterns
The fact that these issues show up so often in mature, widely used libraries highlights how easy they are to introduce. It’s a reminder that even experienced teams can fall into patterns that work “well enough” in the moment but have hidden costs over time.
Tools like the Making Software Greener analyzer are designed to surface these patterns so teams can spot them early. The benefit isn’t only cleaner code—it’s also more efficient and sustainable software in the long run.
Moving Forward
If you’re working on Python projects of your own, here are some simple practices to keep in mind:
- Use logging instead of print(...) for better control and flexibility.
- Favor list joins or StringIO over concatenation inside loops.
- Be specific with exception handling.
- Collect DataFrame rows and use pd.concat once, instead of appending repeatedly.
- Add timeouts to subprocess calls to prevent runaway processes.
- Use context managers (with open(...)) to ensure files are closed properly.
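As a rough sketch of the last two practices, the snippet below wraps subprocess.run with a timeout and reads a file through a context manager. The function names run_with_timeout and count_lines, and the 5-second default, are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_s=5.0):
    """Run a command and return its stdout, or None if it exceeds the timeout.

    Hypothetical helper: subprocess.run stops the child process and raises
    TimeoutExpired when the limit is reached.
    """
    try:
        result = subprocess.run(
            args,
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout


def count_lines(path):
    """Count lines in a text file; the with block closes it even on error."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    # Uses the current interpreter so the example does not depend on
    # any particular external command being installed.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
```

Returning None on timeout is only one possible design; re-raising or logging the TimeoutExpired exception may suit a library better, since callers can then decide how to react.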
These small shifts in habit can make your code faster, more reliable, and easier to maintain—all while reducing wasted computation.