This thesis examines model selection for clustered data. Such data are often modeled using random effects. For settings where cluster-specific inference is desired, Vaida and Blanchard (2005) proposed the conditional Akaike information and used it to derive a corresponding model selection criterion under linear mixed models. We extend the approach to general and generalized linear mixed models. Exact calculations are not available outside linear mixed models, so we resort to asymptotic approximations. We show that under general linear mixed models with correlated errors, the number of effective degrees of freedom is equal to the trace of the usual 'hat' matrix plus the number of parameters in the error covariance matrix. Using this quantity, one can define a crude version of the conditional AIC (cAIC), which is known to be inaccurate because the unknown variance parameters must be estimated. We show, however, that a simple 'rule-of-thumb' correction performs nearly as well as an asymptotically unbiased cAIC that accounts for the unknown parameters; the latter is difficult to compute without specific programming for each error correlation structure. For generalized linear mixed models, we consider a bootstrap method in addition to the rule-of-thumb correction. Finally, we investigate non-parametric estimation of a mean when data are clustered. We consider smoothing via splines with either L1 or L2 penalization. These models may be written as penalized general linear mixed models, thus allowing the use of existing software. We apply our methods to functional MRI time courses from multiple subjects.
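To make the degrees-of-freedom statement concrete, the following minimal sketch computes the trace of the hat matrix and a crude cAIC for the simplest possible case: a balanced random-intercept model y_ij = mu + b_i + e_ij with independent errors. The function names, the parameter `n_error_params`, and the choice to treat the variance components `sigma2` and `tau2` as known plug-in values are assumptions made for illustration; this is not the implementation used in the thesis, and it includes neither the rule-of-thumb correction nor the asymptotically unbiased version.

```python
import math


def effective_df_random_intercept(m, n, sigma2, tau2):
    """Trace of the hat matrix for a balanced random-intercept model.

    m clusters of size n; sigma2 is the error variance and tau2 the
    random-intercept variance, both treated as known (an assumption).
    """
    # Shrinkage weight applied to each cluster mean by the BLUP.
    shrink = n * tau2 / (n * tau2 + sigma2)
    # Fitted value: (1 - shrink) * grand_mean + shrink * cluster_mean,
    # so each diagonal entry of the hat matrix is d(fitted_ij)/d(y_ij).
    diag = (1.0 - shrink) / (m * n) + shrink / n
    return m * n * diag  # algebraically 1 + (m - 1) * shrink


def crude_caic(y, sigma2, tau2, n_error_params=1):
    """Crude cAIC: -2 * conditional log-likelihood + 2 * (trace + q).

    y is a list of m clusters, each a list of n observations (balanced).
    n_error_params (q) counts the error covariance parameters; it is 1
    for independent errors with a single variance.
    """
    m, n = len(y), len(y[0])
    shrink = n * tau2 / (n * tau2 + sigma2)
    grand_mean = sum(sum(cluster) for cluster in y) / (m * n)

    loglik = 0.0
    for cluster in y:
        fitted = (1.0 - shrink) * grand_mean + shrink * (sum(cluster) / n)
        for obs in cluster:
            # Gaussian log-density conditional on the predicted random effect.
            loglik += (-0.5 * math.log(2.0 * math.pi * sigma2)
                       - (obs - fitted) ** 2 / (2.0 * sigma2))

    penalty = effective_df_random_intercept(m, n, sigma2, tau2) + n_error_params
    return -2.0 * loglik + 2.0 * penalty
```

In this special case the trace lies between 1 (no between-cluster variance, so only the overall mean is fitted) and m (each cluster mean is fitted freely), which is the sense in which it measures an effective number of parameters. A correlated error structure would contribute more than one parameter to `n_error_params` and would require the full hat matrix rather than this closed form.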