Optimizing ChatGPT’s image analysis
| Main Author: | |
|---|---|
| Format: | Final Year Project / Dissertation / Thesis |
| Published: | 2025 |
| Subjects: | |
| Online Access: | http://eprints.utar.edu.my/7136/1/fyp_CS_2025_CM.pdf http://eprints.utar.edu.my/7136/ |
| Summary: | Artificial intelligence like ChatGPT has become a powerful tool in image analysis, but
its performance often declines when dealing with low resolution, reduced color depth,
blur, or noise. This project addresses these challenges through three key objectives:
ChatGPT capability evaluation, ChatGPT preprocessing optimization, and
image quality restoration module implementation. Together, these steps aim to establish
clear performance limits for ChatGPT, extend its capability through restoration
techniques, and design a robust pipeline for real-world applications.
First, a capability evaluation was conducted to determine ChatGPT’s
thresholds for reliable image analysis. Systematic testing revealed that a minimum
resolution of 512px with 24-bit RGB color depth provided the most consistent balance
between accuracy and efficiency. Performance dropped sharply when resolution was
reduced to 256px or lower, and blur or noise levels above 15%
significantly impaired accuracy. Further analysis showed that bar and line charts were
more vulnerable to distortion than pie charts, highlighting differences in sensitivity
across visualization types. These experiments established precise thresholds for
resolution, color depth, blur, and noise, providing a baseline for effective and consistent
analysis.
Furthermore, preprocessing optimization and image quality restoration
methods were explored to optimize good inputs and restore degraded ones. For images
already meeting the thresholds, optimization was performed to reduce computational
load. The optimization uses an HSV-based background removal technique, in which a
saturation threshold of 15 effectively reduced file size while maintaining accuracy. For
degraded inputs, an image restoration module was developed. Comparative testing
demonstrated that PixelCut AI consistently outperformed DeepImage AI in deblurring,
while a trained DnCNN model exceeded morphology-based approaches in denoising,
particularly under high noise levels. These findings confirmed that advanced restoration
techniques can extend ChatGPT’s capacity to analyse images that would otherwise fall
below acceptable quality levels.
Moreover, optimization and restoration were implemented as
an integrated image processing pipeline designed to balance efficiency with reliability.
The pipeline begins with quality evaluation to assess resolution, color depth, blur, and noise. Good-quality images are optimized to reduce computational load, while
degraded inputs undergo restoration through resizing, deblurring, or denoising until
they meet the minimum thresholds. A validation stage then ensures all processed
images satisfy the required standards before being analysed by ChatGPT. This structure
allows the system to optimize clear images while reliably enhancing poor-quality
inputs, ensuring consistent results across varied conditions.
The project culminated in the development of the ChatGPT AI Vision Assistant,
implemented in Streamlit, which supports both text and image queries while integrating
the image processing pipeline and a replot function for dynamic chart visualization.
This system enables users to test quality thresholds, experience optimized analysis, and
observe the benefits of restoration methods. Overall, the project defines precise
thresholds for resolution, color depth, blur, and noise, enhances degraded images with
optimized and trained restoration models, and implements a complete pipeline that
balances efficiency with reliability. Together, these outcomes deliver a robust
framework that significantly strengthens ChatGPT’s reliability in real-world image
analysis tasks. |
|---|
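
The pipeline described in the summary (quality evaluation, then optimization or restoration, then validation) can be sketched roughly as below. This is an illustration only, not the project's implementation: the `ImageQuality` container, the function names, and the reading of the 15% figure as a blur/noise score normalized to 0..1 are all assumptions. Only the threshold values themselves (512px, 24-bit, 15%) come from the summary.

```python
from dataclasses import dataclass

# Threshold values reported in the summary. How blur and noise are
# actually measured is not stated there; a 0..1 score is assumed here.
MIN_RESOLUTION_PX = 512
MIN_COLOR_DEPTH_BITS = 24
MAX_DISTORTION = 0.15


@dataclass
class ImageQuality:
    """Hypothetical container for the four properties the pipeline assesses."""
    resolution_px: int       # assumed to be the shorter side, in pixels
    color_depth_bits: int
    blur: float              # assumed normalized score, 0..1
    noise: float             # assumed normalized score, 0..1


def meets_thresholds(q: ImageQuality) -> bool:
    """Validation stage: True when every property satisfies its threshold."""
    return (
        q.resolution_px >= MIN_RESOLUTION_PX
        and q.color_depth_bits >= MIN_COLOR_DEPTH_BITS
        and q.blur <= MAX_DISTORTION
        and q.noise <= MAX_DISTORTION
    )


def plan_steps(q: ImageQuality) -> list[str]:
    """Route an image: optimize it if it is already good, otherwise list
    the restoration steps named in the summary that apply to it."""
    if meets_thresholds(q):
        return ["optimize"]
    steps = []
    if q.resolution_px < MIN_RESOLUTION_PX:
        steps.append("resize")
    if q.blur > MAX_DISTORTION:
        steps.append("deblur")
    if q.noise > MAX_DISTORTION:
        steps.append("denoise")
    # The summary names no restoration step for low color depth, so an
    # image failing only on that property gets an empty plan here.
    return steps


if __name__ == "__main__":
    print(plan_steps(ImageQuality(1024, 24, 0.05, 0.02)))
    print(plan_steps(ImageQuality(256, 24, 0.30, 0.10)))
```

In a fuller version, each restoration step would be applied and the image re-measured until `meets_thresholds` passes, matching the "until they meet the minimum thresholds" wording in the summary.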

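The HSV-based background removal mentioned in the summary might look something like the following sketch. Several details are assumptions rather than facts from the record: that the saturation threshold of 15 is on a 0..255 scale, that background pixels are replaced with flat white, and that a brightness guard (`min_value`, not mentioned in the summary) is used so dark, unsaturated pixels such as black text and axes are kept. Only the Python standard library is used, and pixels are plain tuples rather than a real image type.

```python
import colorsys

SATURATION_THRESHOLD = 15  # value from the summary; the 0..255 scale is assumed


def remove_background(pixels, fill=(255, 255, 255), min_value=128):
    """Replace bright, low-saturation pixels with a flat fill color.

    `pixels` is a list of rows; each row is a list of (r, g, b) tuples
    with channels in 0..255. `min_value` is a hypothetical brightness
    guard (0..255) added for this sketch.
    """
    out = []
    for row in pixels:
        new_row = []
        for r, g, b in row:
            # colorsys works on 0..1 floats and returns h, s, v in 0..1.
            _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_background = s * 255 < SATURATION_THRESHOLD and v * 255 >= min_value
            new_row.append(fill if is_background else (r, g, b))
        out.append(new_row)
    return out


if __name__ == "__main__":
    image = [[(250, 250, 248), (200, 30, 30), (0, 0, 0)]]
    print(remove_background(image))
```

Flattening near-white background to a single color generally helps lossless formats such as PNG compress the image, which is one plausible reason the step reduces file size.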