Feature selection (FS) is extensively studied in machine learning, and we often need to compare two FS algorithms (A1, A2). When the truly relevant features are unknown, the conventional way to evaluate A1 and A2 is to measure the effect of the selected features on classification accuracy in two steps: first, select features from dataset D using Ai to form D′i (i = 1, 2); second, obtain the classification accuracy on each D′i. The superiority of A1 or A2 can then be assessed statistically from the difference in their accuracies. To obtain a reliable accuracy estimate, k-fold cross-validation (CV) is commonly used, in which each fold of the data is reserved in turn for testing. FS may be performed only once at the beginning, after which the results of the two algorithms are compared using CV; or FS may be performed k times, once inside each iteration of the CV loop. At first glance, the latter is the obvious choice for accuracy estimation. In this work, we investigate whether the two approaches really differ when the goal is to compare two FS algorithms, and we report the findings of a bias analysis.
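The two evaluation protocols described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the experimental setup of this work: the mean-difference filter (`select_features`), the nearest-centroid classifier (`accuracy`), and the label-independent synthetic data (`make_noise_data`) are hypothetical stand-ins chosen only to keep the example self-contained. The flag `fs_inside` switches between performing FS once before CV and performing it k times inside the CV loop.

```python
import random

def make_noise_data(n=60, d=200, seed=0):
    """Assumed toy data: Gaussian features generated independently of the labels."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [i % 2 for i in range(n)]  # alternating binary labels
    return X, y

def class_means(X, y, feats):
    """Per-class mean of each feature in feats."""
    means = {}
    for c in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == c]
        means[c] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return means

def select_features(X, y, m):
    """Hypothetical filter: keep the m features with the largest
    absolute difference between the two class means."""
    d = len(X[0])
    mu = class_means(X, y, range(d))
    scores = [abs(mu[1][j] - mu[0][j]) for j in range(d)]
    return sorted(range(d), key=lambda j: -scores[j])[:m]

def accuracy(Xtr, ytr, Xte, yte, feats):
    """Nearest-centroid classifier restricted to the selected features."""
    mu = class_means(Xtr, ytr, feats)
    correct = 0
    for x, t in zip(Xte, yte):
        dist = {c: sum((x[j] - v) ** 2 for j, v in zip(feats, mu[c])) for c in mu}
        correct += min(dist, key=dist.get) == t
    return correct / len(Xte)

def cv_accuracy(X, y, k, m, fs_inside):
    """k-fold CV accuracy; FS runs once on all of D (fs_inside=False)
    or on the training folds of every iteration (fs_inside=True)."""
    n = len(X)
    folds = [list(range(i, n, k)) for i in range(k)]  # interleaved folds
    feats_once = None if fs_inside else select_features(X, y, m)
    accs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        Xtr, ytr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        Xte, yte = [X[i] for i in test_idx], [y[i] for i in test_idx]
        feats = select_features(Xtr, ytr, m) if fs_inside else feats_once
        accs.append(accuracy(Xtr, ytr, Xte, yte, feats))
    return sum(accs) / k

if __name__ == "__main__":
    X, y = make_noise_data()
    print("FS once, then CV:  ", cv_accuracy(X, y, k=5, m=10, fs_inside=False))
    print("FS inside CV loop: ", cv_accuracy(X, y, k=5, m=10, fs_inside=True))
```

To compare two FS algorithms under either protocol, one would call `cv_accuracy` once per algorithm (swapping in a second selector for `select_features`) and examine the accuracy difference. The sketch only shows where the selection step sits relative to the CV loop; it makes no claim about which protocol is preferable for that comparison.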