Improved frequency table with application to environmental data

This paper proposes three statistics to represent the magnitude of the observations in each class when estimating statistical measures from a frequency table for continuous data. Existing frequency tables use the midpoint as the magnitude of the observations in each class, which...

Bibliographic Details
Main Authors: Mohammed, M. B., Adam, Mohd Bakri, Zulkafli, Hani Syahida, Ali, Norhaslinda
Format: Article
Language: English
Published: Horizon Research Publishing 2020
Online Access: http://psasir.upm.edu.my/id/eprint/89315/1/TABLE.pdf
http://psasir.upm.edu.my/id/eprint/89315/
https://www.hrpub.org/journals/article_info.php?aid=8927
id my.upm.eprints.89315
record_format eprints
spelling my.upm.eprints.89315 2021-09-03T08:44:31Z http://psasir.upm.edu.my/id/eprint/89315/ Mohammed, M. B. and Adam, Mohd Bakri and Zulkafli, Hani Syahida and Ali, Norhaslinda (2020) Improved frequency table with application to environmental data. Mathematics and Statistics, 8 (2). 201 - 210. ISSN 2332-2071; ESSN: 2332-2144. Horizon Research Publishing. Article, PeerReviewed, text, en. http://psasir.upm.edu.my/id/eprint/89315/1/TABLE.pdf https://www.hrpub.org/journals/article_info.php?aid=8927 DOI: 10.13189/ms.2020.080216
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description This paper proposes three statistics to represent the magnitude of the observations in each class when estimating statistical measures from a frequency table for continuous data. Existing frequency tables use the class midpoint as the magnitude of the observations in each class, which results in an error called grouping error. Using the midpoint rests on the assumption that the observations in each class are uniformly distributed and concentrated around their midpoint, which is not always valid. In this research, frequency tables constructed using the three proposed statistics (the arithmetic mean, the median, and the midrange) and the midpoint are named Method 1, Method 2, Method 3, and the Existing method, respectively. The four methods are compared using root-mean-squared error (RMSE) in simulation studies with three distributions: normal, uniform, and exponential. The simulation results are validated using real data, namely Glasgow weather data. The findings indicate that using the arithmetic mean to represent the magnitude of the observations in each class leads to the smallest error among the statistics considered. It is followed by the median for data simulated from the normal and exponential distributions, and by the midrange for data simulated from the uniform distribution. Among the seven rules considered for choosing the number of classes used in constructing the frequency tables, the Freedman and Diaconis rule is the recommended one.
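The comparison described in the abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the equal-width class construction, the sample sizes, and the choice to measure RMSE on the grouped estimate of the mean against the raw-data mean are all assumptions made for illustration. The abstract does not say which statistical measures the RMSE was computed on.

```python
import math
import random
import statistics


def freedman_diaconis_classes(data):
    """Number of classes from the Freedman-Diaconis rule (width = 2 * IQR / n^(1/3))."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    width = 2 * (q3 - q1) / len(data) ** (1 / 3)
    if width == 0:  # degenerate sample with zero IQR
        return 1
    return max(1, math.ceil((max(data) - min(data)) / width))


def frequency_table(data, k):
    """Split non-constant data into k equal-width classes.

    Returns one (lower, upper, members) tuple per non-empty class.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    classes = [[] for _ in range(k)]
    for x in data:
        i = min(int((x - lo) / width), k - 1)  # the maximum goes in the last class
        classes[i].append(x)
    return [(lo + i * width, lo + (i + 1) * width, c)
            for i, c in enumerate(classes) if c]


# Class representatives; labels follow the method names used in the abstract.
REPRESENTATIVES = {
    "mean (Method 1)": lambda a, b, c: statistics.fmean(c),
    "median (Method 2)": lambda a, b, c: statistics.median(c),
    "midrange (Method 3)": lambda a, b, c: (min(c) + max(c)) / 2,
    "midpoint (Existing)": lambda a, b, c: (a + b) / 2,
}


def grouped_mean(table, rep):
    """Estimate the mean from a frequency table using one representative per class."""
    n = sum(len(c) for _, _, c in table)
    return sum(len(c) * rep(a, b, c) for a, b, c in table) / n


def rmse_of_grouped_mean(draw, rep, n=200, runs=100, seed=0):
    """RMSE of the grouped mean against the raw-data mean over repeated samples.

    Assumption: error is measured on the mean only; other measures could be used.
    """
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(runs):
        data = [draw(rng) for _ in range(n)]
        table = frequency_table(data, freedman_diaconis_classes(data))
        squared_errors.append((grouped_mean(table, rep) - statistics.fmean(data)) ** 2)
    return math.sqrt(statistics.fmean(squared_errors))


if __name__ == "__main__":
    # Exponential samples as one of the three distributions named in the abstract.
    for name, rep in REPRESENTATIVES.items():
        print(name, rmse_of_grouped_mean(lambda r: r.expovariate(1.0), rep))
```

For the mean specifically, weighting each class mean by its frequency reproduces the raw-data mean up to floating-point rounding, so Method 1 has essentially zero grouping error in this sketch. Other measures, such as the variance, would need their own grouped estimators.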
format Article
author Mohammed, M. B.
Adam, Mohd Bakri
Zulkafli, Hani Syahida
Ali, Norhaslinda
title Improved frequency table with application to environmental data
publisher Horizon Research Publishing
publishDate 2020