call_peaks
Call peaks for the given Calling Cards data.
The kwargs parameter is used to pass additional arguments into underlying
functions. Currently, the following are configured:
- pranges_rename_dict: a dictionary that maps the column names in the
promoter data to the column names in the PyRanges object. This is used
to rename the columns in the PyRanges object after the promoter data
is read in. The default is {"chr": "Chromosome", "start": "Start",
"end": "End", "strand": "Strand"}.
- join_validate: the validation method to use when joining the promoter
data with the experiment and background data. The default is
"one_to_one".
- background_total_hops: the total number of hops in the background data.
The default is the number of hops in the background data, calculated from
the input background data file.
- experiment_total_hops: the total number of hops in the experiment data.
The default is the number of hops in the experiment data, calculated from
the input experiment data file.
- genomic_only: set this flag to include only genomic chromosomes in the
experiment and background. See read_in_
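As a rough illustration of the first two options, the sketch below renames promoter columns with the default mapping and then joins on a key with a one-to-one check. It assumes that join_validate is forwarded to the validate argument of pandas DataFrame.merge, which the text above does not confirm. The tables, the name key, and the experiment_hops column are hypothetical and are not taken from call_peaks itself.

```python
import pandas as pd

# Hypothetical promoter table using the source-side column names
# from the documented default mapping.
promoters = pd.DataFrame({
    "chr": ["chrI", "chrI"],
    "start": [100, 500],
    "end": [200, 600],
    "strand": ["+", "-"],
    "name": ["p1", "p2"],
})

# Documented default for pranges_rename_dict.
pranges_rename_dict = {
    "chr": "Chromosome",
    "start": "Start",
    "end": "End",
    "strand": "Strand",
}
promoters = promoters.rename(columns=pranges_rename_dict)

# Hypothetical per-promoter hop counts to join against.
experiment_counts = pd.DataFrame({"name": ["p1", "p2"], "experiment_hops": [12, 3]})

# Assumption: join_validate="one_to_one" behaves like pandas' merge validation,
# which raises if the join key is duplicated on either side.
merged = promoters.merge(
    experiment_counts, on="name", how="left", validate="one_to_one"
)

# A duplicated key on the right side violates the one-to-one constraint.
duplicated = pd.concat(
    [experiment_counts, experiment_counts.iloc[[0]]], ignore_index=True
)
try:
    promoters.merge(duplicated, on="name", validate="one_to_one")
    raised = False
except pd.errors.MergeError:
    raised = True
```

Under this assumption, a stricter or looser check (for example "one_to_many") would be passed the same way; whether call_peaks accepts those values is not stated here.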
:param experiment_data_paths: path(s) to the hops (experiment) data file(s). If
multiple paths are provided, they will be concatenated, according to the
deduplicate and genomic_only flags, prior to processing. On the
concatenated data, however, the deduplicated flag is set to False, since
the data within each file was already deduplicated if the flag was set to True,
and in the concatenated data, multiple hops at the same location are meaningful.
:type experiment_data_paths: list
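The per-file deduplication followed by a plain concatenation can be sketched with pandas as follows. This is a minimal sketch and not the library's implementation: concat_hops is a hypothetical helper, the column names chr, start, end and strand are assumed from the description of deduplicate_experiment, and the genomic_only filtering is left out because its rules are not described here.

```python
import pandas as pd


def concat_hops(frames, deduplicate=True):
    """Hypothetical helper: deduplicate each file on its own, then concatenate.

    No deduplication is applied to the combined table, so the same
    coordinate seen in two files is kept as two hops.
    """
    processed = []
    for df in frames:
        if deduplicate:
            # Within one file, collapse records that share chr/start/end,
            # regardless of strand.
            df = df.drop_duplicates(subset=["chr", "start", "end"])
        processed.append(df)
    return pd.concat(processed, ignore_index=True)


# Two hypothetical replicate files; rep1 has the same coordinate on both strands.
rep1 = pd.DataFrame({
    "chr": ["chrI", "chrI"],
    "start": [10, 10],
    "end": [11, 11],
    "strand": ["+", "-"],
})
rep2 = pd.DataFrame({"chr": ["chrI"], "start": [10], "end": [11], "strand": ["+"]})

combined = concat_hops([rep1, rep2])
not_deduplicated = concat_hops([rep1, rep2], deduplicate=False)
```

In this sketch the combined table keeps one record from each file at the shared coordinate, which is the behavior the parameter description calls meaningful.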
:param experiment_orig_chr_convention: the chromosome naming convention
used in the experiment data file.
:type experiment_orig_chr_convention: str
:param promoter_data_path: path to the promoter data file.
:type promoter_data_path: str
:param promoter_orig_chr_convention: the chromosome naming convention
used in the promoter data file.
:type promoter_orig_chr_convention: str
:param background_data_path: path to the background data file.
:type background_data_path: str
:param background_orig_chr_convention: the chromosome naming convention
used in the background data file.
:type background_orig_chr_convention: str
:param chrmap_data_path: path to the chromosome map file.
:type chrmap_data_path: str
:param deduplicate_experiment: If this is true, the experiment data will be
deduplicated based on chr, start and end such that if an insertion
is found at the same coordinate on different strands, only one of those records
will be retained. See read_in_experiment_data for more details.
:type deduplicate_experiment: bool
:param unified_chr_convention: the chromosome naming convention
to use in the output DataFrame.
:type unified_chr_convention: str
:return: a pandas DataFrame of promoter regions with Calling Cards metrics.
:rtype: DataFrame
Source code in callingcardstools/PeakCalling/yeast/call_peaks.py