Hi to the developers,
I don't understand why I get the following error, and only when processing the clinical data:
read_xml.character(xmlfile) :
Start tag expected, '<' not found [4]
I have done the following:
query_clin <- GDCquery(project = "TCGA-BRCA",
                       data.type = "Clinical Supplement",
                       data.category = "Clinical",
                       data.format = "XML")
GDCdownload(query_clin, files.per.chunk = 20)
clinical_data <- GDCprepare_clinic(query_clin, clinical.info = "patient")
It fails with the error above about 31% of the way through.
I manually checked the downloaded files for malformed XML, but could not find any.
To check, I tried:
cd ~/BC_CosMx/GDCdata/TCGA-BRCA/Clinical/Clinical_Supplement
find . -type f -name "*.xml" | while read -r file; do
  first_char=$(head -c 1 "$file")
  if [ "$first_char" != "<" ]; then
    echo "<$(cat "$file")" > "$file"
  fi
done
None of the files came back as modified, so I take that to mean the XML files are fine.
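One caveat with the loop above: it prints nothing, and it rewrites any file it flags, so on its own it cannot confirm that no file was changed. A read-only variant might look like the sketch below. The function name `list_suspect_xml` is hypothetical (it is not part of TCGAbiolinks or any other tool), and it assumes `head -c` is available, as in the command above. It only lists files whose first byte is not `<`, including empty files, and modifies nothing.

```shell
#!/bin/sh
# Hypothetical helper: list *.xml files under a directory whose first byte
# is not '<'. Read-only: no file is ever written to.
list_suspect_xml() {
    dir=${1:-.}
    find "$dir" -type f -name '*.xml' | while IFS= read -r file; do
        # An empty file yields an empty string here, so it is reported too.
        first_char=$(head -c 1 "$file")
        if [ "$first_char" != "<" ]; then
            # Print the size as well, so empty or truncated downloads stand out.
            size=$(wc -c < "$file" | tr -d ' ')
            printf '%s\t%s bytes\n' "$file" "$size"
        fi
    done
}
```

Calling `list_suspect_xml ~/BC_CosMx/GDCdata/TCGA-BRCA/Clinical/Clinical_Supplement` would print one line per suspect file and nothing if every file starts with `<`. Passing this check shows only that the first byte is `<`, not that the file is well-formed XML.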
I have been doing all of this in the latest RStudio, using the RStudio terminal for the XML format check, both provided to me through an HPC cluster.
Please help me figure out what I am doing wrong. A quick reply would be great: I have a conference in a few days at which I would like to present some results from this.
Many thanks,
Shani.