RECERCAT - Documents de treball (Estadística)
http://www.recercat.cat:80/handle/2072/48804
www.ub.edu, Sun, 01 Feb 2015 18:29:11 GMT
A measure of stability as a criterion for the verification and analysis of simulation models
http://www.recercat.cat:80/handle/2072/49978
Monleón Getino, Toni; Ruiz de Villa, Carmen; Ocaña i Rebull, Jordi
The aim of this study is to define a new statistic, PVL, based on the relative distance between the likelihood associated with the simulation replications and the likelihood of the conceptual model. Our results from several simulation experiments of a clinical trial show that the range of the PVL statistic can be a good measure of stability for establishing when a computational model verifies the underlying conceptual model. PVL also improves the analysis of simulation replications because a single statistic is associated with all the replications. The study also presents several verification scenarios, obtained by altering the simulation model, that show the usefulness of PVL. Further simulation experiments suggest that a range of 0 to 20% may define adequate limits for the verification problem, if it is considered from the viewpoint of an equivalence test.
R commands secondary production
http://www.recercat.cat:80/handle/2072/171833
Gaudes Saez, Ainhoa; Ocaña i Rebull, Jordi; Muñoz Gràcia, Isabel
R commands to calculate the secondary production estimates using the size-frequency method after Hynes and Coleman (1968), Benke (1979) and Huryn (1996).
Uso de la estadística en la oncología y la hematología [Use of statistics in oncology and hematology]
http://www.recercat.cat:80/handle/2072/49979
Monleón Getino, Toni
In this article we address the use and importance of the statistical tools mainly used in medical studies in oncology and hematology, which are also applicable to many other medical, experimental, and industrial fields. The objective of this paper is to present, clearly and precisely, the statistical methodology needed to analyze study data rigorously and concisely with respect to the working hypotheses set out by the researchers. The measure of response to treatment chosen for a given type of study determines the statistical methods used to analyze the data, as well as the sample size. Through correct application of statistical analysis and adequate planning, one can determine whether the relationship found between exposure to a treatment and an outcome is due to chance or, on the contrary, reflects a non-random relationship that could establish causality. We have studied the main designs used in medical studies, such as clinical trials and observational studies (cohort, case-control, prevalence, and ecological studies). The article also includes sections on calculating the sample size, on which statistical test should be used, on measures of effect strength, namely the odds ratio (OR) and relative risk (RR), and on survival analysis. Examples are given in most sections of the article, together with the most relevant bibliography.
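The odds ratio and relative risk named in the abstract above can be illustrated with a minimal Python sketch. The function name, the 2x2 table layout, and the counts are hypothetical and are not taken from the article; the interval is the usual Wald approximation on the log scale.

```python
import math


def odds_ratio_and_relative_risk(a, b, c, d, z=1.96):
    """Effect measures for a 2x2 table (illustrative helper, not from the article).

    a: exposed with the event      b: exposed without the event
    c: unexposed with the event    d: unexposed without the event
    """
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    # Wald interval for the OR, computed on the log scale.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci = (math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or))
    return odds_ratio, relative_risk, ci


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed had the event.
or_, rr, or_ci = odds_ratio_and_relative_risk(20, 80, 10, 90)
print(f"OR = {or_:.2f}, RR = {rr:.2f}")  # OR = 2.25, RR = 2.00
```

With these counts the OR exceeds the RR, which is the typical pattern when the event is not rare.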
Supplementary material to Ocaña, J., Sánchez, M.P., Sánchez, A. and Carrasco, J.L. "On equivalence and bioequivalence testing", SORT, 32(2), 2008
http://www.recercat.cat:80/handle/2072/49977
Ocaña i Rebull, Jordi; Sánchez Olavarría, Maria Pilar; Sànchez, Àlex (Sànchez Pla); Carrasco Jordan, Josep Lluís
Impact of incorrect assumptions on the covariance structure of random effects and/or residuals in nonlinear mixed models for repeated measures data
http://www.recercat.cat:80/handle/2072/49976
El Halimi, Rachid; Ocaña i Rebull, Jordi
In this paper we use Monte Carlo simulation to analyse the possible consequences of incorrect assumptions about the true structure of the random effects covariance matrix and the true correlation pattern of the residuals on the performance of an estimation method for nonlinear mixed models. The procedure under study is the well-known linearization method due to Lindstrom and Bates (1990), implemented in the nlme library of S-Plus and R. Its performance is studied in terms of bias, mean square error (MSE), and true coverage of the associated asymptotic confidence intervals. Setting aside other criteria, such as the convenience of avoiding overparameterised models, it seems worse to erroneously assume some structure than to assume no structure when one would be adequate.
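The three performance criteria named in the abstract above (bias, MSE, and true coverage of an asymptotic confidence interval) can be sketched in Python. This is only an illustration under simplifying assumptions: the estimator is a plain sample mean with a normal-approximation interval, not the Lindstrom and Bates method or the nlme library, and all parameter values are hypothetical.

```python
import math
import random
import statistics


def monte_carlo_performance(true_mean=5.0, sd=2.0, n=30, replications=2000, seed=1):
    """Estimate bias, MSE and interval coverage of the sample mean by simulation."""
    rng = random.Random(seed)
    z = 1.96  # nominal 95% asymptotic interval
    estimates = []
    covered = 0
    for _ in range(replications):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimate = statistics.fmean(sample)
        std_error = statistics.stdev(sample) / math.sqrt(n)
        estimates.append(estimate)
        if estimate - z * std_error <= true_mean <= estimate + z * std_error:
            covered += 1
    bias = statistics.fmean(estimates) - true_mean
    mse = statistics.fmean([(e - true_mean) ** 2 for e in estimates])
    coverage = covered / replications
    return bias, mse, coverage


bias, mse, coverage = monte_carlo_performance()
print(f"bias = {bias:.4f}, MSE = {mse:.4f}, coverage = {coverage:.3f}")
```

Comparing the observed coverage with the nominal 95% is the same kind of check the abstract describes, applied here to a much simpler model.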
On the consequences of misspecifying assumptions concerning residuals distribution in a repeated measures and nonlinear mixed modelling context
http://www.recercat.cat:80/handle/2072/49975
El Halimi, Rachid; Ocaña i Rebull, Jordi
In this paper we describe the results of a simulation study performed to assess the robustness of the Lindstrom and Bates (1990) approximation method under non-normality of the residuals in different situations. Concerning the fixed effects, the observed coverage probabilities and the true bias and mean square error values show that some aspects of this inferential approach are not completely reliable. When the true distribution of the residuals is asymmetrical, the true coverage is markedly lower than the nominal one. The best results are obtained for the skew normal distribution, not for the normal distribution. On the other hand, the results are partially reversed for the random effects. Soybean genotype data are used to illustrate the methods and to motivate the simulation scenarios.
Spectral analysis of the luteinizing hormone in the blood samples
http://www.recercat.cat:80/handle/2072/49974
Liutsko, Liudmila
Medical textbooks generally concentrate almost exclusively on methodology for analysing fixed measures, that is, measures taken at a single moment. Nevertheless, the evolution of a measurement over time and the correct interpretation of missing values are very important and can sometimes provide key information about the results obtained. Time series analysis, and in particular spectral analysis (analysis of time series in the frequency domain), can therefore be regarded as an appropriate tool for this kind of study. In this work the pulsatile secretion of luteinizing hormone (LH), which regulates the fertile life of women, was analysed in order to determine whether significant frequencies can be identified by Fourier analysis. Detecting the frequencies at which the pulsatile secretion of LH takes place is quite difficult because of random errors in measurement and sampling; that is, pulsatile secretions of small amplitude are not detected and are disregarded. In physiology it is accepted that cyclical patterns exist in the secretion of LH; the results of this research confirm this pattern and determine its frequency, presented in the periodogram corresponding to each of the cycles studied. The results can be used as a key pattern for choosing future sampling frequencies in order to "catch" the significant peaks of luteinizing hormone and be reflected in the timing of fertility treatment for women. Diploma d'Estudis Avançats (Diploma of Advanced Studies), doctoral programme in Statistics, 2008. Tutor: Martín Ríos Alcolea.
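A periodogram of the kind mentioned in the abstract above can be sketched with a direct discrete Fourier transform. The series below is synthetic (a sine wave with a period of 8 samples plus noise) and the function name is hypothetical; it does not reproduce the LH data or the exact procedure of the study.

```python
import cmath
import math
import random


def periodogram(series):
    """Return (frequency, power) pairs for the Fourier frequencies k/n, k = 1..n/2."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the mean so frequency 0 does not dominate
    result = []
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)
        )
        result.append((k / n, abs(coefficient) ** 2 / n))
    return result


# Synthetic example: one pulse every 8 samples, observed with random error.
rng = random.Random(0)
series = [math.sin(2 * math.pi * t / 8) + rng.gauss(0.0, 0.5) for t in range(64)]
peak_frequency, peak_power = max(periodogram(series), key=lambda pair: pair[1])
print(f"dominant period: about {1 / peak_frequency:.1f} samples")
```

The frequency with the largest power identifies the dominant cycle; deciding whether a peak is statistically significant requires a further test that this sketch does not include.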
What trees tell us: dendrochronological and statistical analysis of the data
http://www.recercat.cat:80/handle/2072/49973
Liutsko, Liudmila
Trees are a great bank of data and for this reason are sometimes called the "silent witnesses" of the past. Because annual ring formation is normally influenced directly by climate parameters (generally changes in temperature and in moisture or precipitation) and other environmental factors, past changes are "written" in the tree "archives" and can be "decoded" to interpret what happened before, mainly for the reconstruction of past climate. Using dendrochronological methods, samples of Pinus nigra were obtained from the Catalan Pre-Pyrenees region. The cores of 15 trees, with a total time span of about 100 to 250 years, were analysed for tree ring width (TRW) patterns and showed quite high correlations between them (0.71 to 0.84), corresponding to a common response of annual growth to environmental changes. After different trials of standardization of the raw TRW data to remove the dependency on the negative exponential growth curve, the best method, double detrending (power transformation followed by a 32-year smoothing line), was selected to obtain the indexes for further analysis. Analysis of the cross-correlations between the tree ring width indexes and climate data showed significant correlations (p<0.05) at some lags; for example, annual precipitation at lag -1 (previous year) was negatively correlated with TRW growth in the Pallars region. Significant correlation coefficients are between 0.27 and 0.51 in absolute value in many cases; for the recent (but very short) climate record of the Seu d'Urgell meteorological station, some significant correlation coefficients of the order of 0.9 were observed. These results support the hypothesis that dendrochronological data can be used as a climate signal for further analysis, such as reconstruction of past climate or prediction of future climate for the same locality. Diploma d'Estudis Avançats (Diploma of Advanced Studies), doctoral programme in Statistics, 2008. Tutor: Dr. Antoni Monleón.
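The lagged cross-correlation described in the abstract above (for example, ring width against the previous year's precipitation at lag -1) can be sketched as follows. The data are synthetic and the function names are hypothetical; detrending and significance testing are left out of this illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(ring_index, climate, lag):
    """Correlate ring_index[t] with climate[t + lag]; lag=-1 uses the previous year."""
    pairs = [
        (ring_index[t], climate[t + lag])
        for t in range(len(ring_index))
        if 0 <= t + lag < len(climate)
    ]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Synthetic example: growth responds negatively to the previous year's climate value.
rng = random.Random(0)
climate = [rng.gauss(0.0, 1.0) for _ in range(60)]
ring_index = [rng.gauss(0.0, 0.3)] + [
    -0.8 * climate[t - 1] + rng.gauss(0.0, 0.3) for t in range(1, 60)
]
for lag in (-2, -1, 0):
    print(f"lag {lag:+d}: r = {lagged_correlation(ring_index, climate, lag):.2f}")
```

In this constructed series only lag -1 should show a strong (negative) correlation, mirroring the kind of pattern the abstract reports for annual precipitation.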
Anàlisi factorial confirmatòria per a variables categòriques: Aplicació al qüestionari de discapacitat WHODAS-II [Confirmatory factor analysis for categorical variables: application to the WHODAS-II disability questionnaire]
http://www.recercat.cat:80/handle/2072/49972
Vilagut Saiz, Gemma
Objective: To describe the methodology of Confirmatory Factor Analysis for categorical items and to apply it to evaluate the factor structure and invariance of the WHO Disability Assessment Schedule (WHODAS-II) questionnaire, developed by the World Health Organization.
Methods: Data come from the European Study of Mental Disorders (ESEMeD), a cross-sectional interview survey of a representative sample of the general population of six European countries (n=8796). Respondents were administered a modified version of the WHODAS-II, which measures functional disability in the previous 30 days in 6 dimensions: Understanding and Communicating, Self-Care, Getting Around, Getting Along with Others, Life Activities, and Participation. The questionnaire includes two types of items: 22 severity items (5-point Likert) and 8 frequency items (continuous). An Exploratory Factor Analysis (EFA) with promax rotation was conducted on a random 50% of the sample. The remaining half of the sample was used to perform a Confirmatory Factor Analysis (CFA) in order to compare three models: (a) the model suggested by the EFA results; (b) the theoretical model with 6 dimensions suggested by the WHO; (c) a reduced model equivalent to model b but excluding 4 of the frequency items. A second-order factor was also evaluated. Finally, a CFA with covariates was estimated in order to evaluate measurement invariance of the items between Mediterranean and non-Mediterranean countries.
Results: The EFA solution that gave the best results was the one with 7 factors. Two of the frequency items presented high factor loadings on the same factor, and one of them presented factor loadings smaller than 0.3 on all the factors. In the CFA, the reduced model (model c) presented the best goodness of fit (CFI=0.992, TLI=0.996, RMSEA=0.024). The second-order factor structure presented adequate goodness of fit (CFI=0.987, TLI=0.991, RMSEA=0.036). Measurement non-invariance was detected for one of the items of the questionnaire (FD20, Embarrassment due to health problems).
Conclusions: CFA confirmed the initial hypothesis of a 6-factor structure for the WHODAS-II. The second-order factor supports the existence of a global dimension of disability. The use of 4 of the frequency items is not recommended in the scoring of the corresponding dimensions.
Diploma d'Estudis Avançats (Diploma of Advanced Studies), doctoral programme in Statistics, Data Analysis and Biostatistics, 2008. Tutors: Josep Fortiana and Jordi Alonso.
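The fit indices reported above (CFI, TLI, RMSEA) are functions of the chi-square statistics of the fitted model and of a baseline (independence) model. The sketch below uses the common textbook definitions and hypothetical input values; it is an assumption that these match the study's software, since estimators for categorical items may apply adjusted chi-square statistics and some programs use N rather than N - 1 in the RMSEA.

```python
import math


def fit_indices(chi2_model, df_model, chi2_baseline, df_baseline, n):
    """Textbook CFI, TLI and RMSEA from model and baseline chi-square statistics."""
    excess_model = max(chi2_model - df_model, 0.0)
    excess_baseline = max(chi2_baseline - df_baseline, 0.0)
    denominator = max(excess_baseline, excess_model)
    cfi = 1.0 if denominator == 0 else 1.0 - excess_model / denominator
    ratio_baseline = chi2_baseline / df_baseline
    ratio_model = chi2_model / df_model
    tli = (ratio_baseline - ratio_model) / (ratio_baseline - 1.0)
    rmsea = math.sqrt(excess_model / (df_model * (n - 1)))
    return cfi, tli, rmsea


# Hypothetical values, not taken from the study.
cfi, tli, rmsea = fit_indices(
    chi2_model=150.0, df_model=100, chi2_baseline=5120.0, df_baseline=120, n=501
)
print(f"CFI = {cfi:.3f}, TLI = {tli:.3f}, RMSEA = {rmsea:.3f}")
```

Values of CFI and TLI close to 1 and a small RMSEA indicate good fit, which is how the figures in the Results paragraph are usually read.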
Estudi de l'estat de Salut autopercebut: Modelització de l'índex d'utilitat EQ-5D mitjançant un model tobit [Study of self-perceived health status: modelling the EQ-5D utility index with a tobit model]
http://www.recercat.cat:80/handle/2072/49971
Vilagut Saiz, Gemma
Objective: Health status measures usually have an asymmetric distribution and a high percentage of respondents with the best possible score (ceiling effect), especially when they are assessed in the general population. Several methods that take the ceiling effect into account have been proposed to model this type of variable: tobit models, Censored Least Absolute Deviations (CLAD) models, and two-part models, among others. The objective of this work was to describe the tobit model and compare it with the Ordinary Least Squares (OLS) model, which ignores the ceiling effect.
Methods: Two data sets were used to compare the two models: (a) real data from the European Study of Mental Disorders (ESEMeD), used to model the EQ-5D index, one of the utility measures most commonly used to evaluate health status; and (b) simulated data. Cross-validation was used to compare the predicted values of the tobit and OLS models. The following estimators were compared: the percentage of absolute error (R1), the percentage of squared error (R2), the Mean Squared Error (MSE) and the Mean Absolute Prediction Error (MAPE). Different data sets were created for different values of the error variance and different percentages of individuals at the ceiling. The coefficient estimates, the percentage of explained variance and the plots of residuals versus predicted values obtained under each model were compared.
Results: In the ESEMeD study, the predicted values obtained with the OLS model and with the tobit models were very similar. The regression coefficients of the linear model were consistently smaller than those of the tobit model. In the simulation study, we observed that when the error variance was small (s=1), the tobit model gave unbiased coefficient estimates and accurate predicted values, especially when the percentage of individuals with the highest possible score was small. However, when the error variance was greater (s=10 or s=20), the percentage of explained variance and the predicted values of the tobit model were more similar to those obtained with an OLS model.
Conclusions: The proportion of variability accounted for by the models and the percentage of individuals with the highest possible score have an important effect on the performance of the tobit model in comparison with the linear model.
Diploma d'Estudis Avançats (Diploma of Advanced Studies), doctoral programme in Statistics, Data Analysis and Biostatistics, 2008. Tutors: Josep Fortiana and Jordi Alonso.
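The contrast between a tobit model and OLS under a ceiling effect can be sketched with the tobit log-likelihood for right-censored data. Everything here is an illustrative assumption: a single covariate, a hypothetical ceiling of 2.5, simulated data, and no numerical maximisation (the likelihood is only evaluated at the true parameters and at the OLS estimates).

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tobit_loglik(intercept, slope, sigma, xs, ys, ceiling):
    """Log-likelihood of a tobit model with right-censoring at `ceiling`."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu = intercept + slope * x
        if y >= ceiling:
            # Censored observation: probability that the latent value exceeds the ceiling.
            total += math.log(STD_NORMAL.cdf(-(ceiling - mu) / sigma))
        else:
            # Uncensored observation: normal density of the observed value.
            total += math.log(STD_NORMAL.pdf((y - mu) / sigma)) - math.log(sigma)
    return total


def ols_fit(xs, ys):
    """Simple linear regression; returns intercept, slope and residual standard deviation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, math.sqrt(rss / (n - 2))


def simulate_ceiling_data(n=2000, ceiling=2.5, seed=42):
    """Latent y* = 1 + 2x + e with e ~ N(0, 1); the observed score is capped at the ceiling."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [min(1.0 + 2.0 * x + rng.gauss(0.0, 1.0), ceiling) for x in xs]
    return xs, ys


xs, ys = simulate_ceiling_data()
b0, b1, s = ols_fit(xs, ys)
ll_true = tobit_loglik(1.0, 2.0, 1.0, xs, ys, 2.5)
ll_ols = tobit_loglik(b0, b1, s, xs, ys, 2.5)
print(f"OLS slope on capped data: {b1:.2f} (latent slope is 2.0)")
print(f"tobit log-likelihood at true parameters: {ll_true:.1f}; at OLS estimates: {ll_ols:.1f}")
```

The OLS slope is pulled toward zero by the capped scores, which is consistent with the abstract's observation that the linear-model coefficients were smaller than the tobit ones; a full tobit fit would maximise `tobit_loglik` with a numerical optimiser.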